Associate Data Engineer (Databricks) (Ref 26210)

JOBLINE RESOURCES PTE. LTD.

Singapore

On-site

SGD 90,000 - 130,000

Full time

Today

Job summary

A leading data consultancy in Singapore seeks an experienced professional to oversee Databricks platform operations, ensuring data quality, governance, and compliance. The ideal candidate has 8-10 years in system operations, extensive knowledge of Databricks, and AWS certification. This role is crucial for optimizing data processes and mentoring junior team members.

Qualifications

  • 8-10 years of experience in system operations, compliance, and management.
  • Hands-on experience with Databricks platform.
  • Must be cloud certified (AWS).
  • Expert-level proficiency in Databricks.
  • Strong knowledge of monitoring and incident management.

Responsibilities

  • Build and optimize ETL/ELT processes for large data volumes.
  • Implement data quality frameworks to ensure data accuracy.
  • Monitor and maintain production data pipelines to ensure 99.9% uptime.
  • Establish best practices for data governance and compliance.
  • Provide technical guidance and mentorship to junior members.

Skills

Databricks
AWS cloud services
Apache Spark
Delta Lake
Data warehouse concepts
CI/CD pipelines
Monitoring
Incident management

Education

Degree in Computer Science or Computer Engineering

Tools

Databricks Unity Catalog
Tableau
Oracle Database

Job description

Responsibilities

  • Build and optimize ETL/ELT processes leveraging Databricks' native capabilities to handle large volumes of structured and unstructured data from various sources
  • Implement data quality frameworks and monitoring solutions using Databricks data quality features to ensure data accuracy and reliability across all data products
  • Establish best practices for data governance, security, and compliance within the Databricks ecosystem and integrate with enterprise systems
  • Monitor and maintain production data pipelines to ensure 99.9% uptime and optimal performance across all Databricks workloads and clusters
  • Implement comprehensive logging, alerting, and monitoring systems using Databricks monitoring capabilities and integration with enterprise monitoring tools
  • Perform regular health checks on Databricks cluster performance, job execution times, and resource utilization to identify and resolve bottlenecks proactively
  • Manage incident response procedures for Databricks pipeline failures, including root cause analysis, resolution, and post-incident reviews
  • Establish and maintain disaster recovery procedures and backup strategies for critical data assets within the Databricks environment
  • Conduct regular performance tuning of Spark jobs and Databricks cluster configurations to optimize cost and execution efficiency
  • Implement automated testing frameworks for Databricks-based data pipelines, including unit tests, integration tests, and data validation checks
  • Maintain comprehensive documentation for all Databricks operational procedures, runbooks, and troubleshooting guides
  • Coordinate scheduled maintenance windows and Databricks system upgrades with minimal business impact
  • Manage user access controls, workspace configurations, and security policies within Databricks environments
  • Monitor data lineage using Databricks Unity Catalog and maintain metadata management systems to support operational transparency and compliance requirements
  • Establish capacity planning processes to forecast Databricks infrastructure needs and manage cloud costs effectively
  • Provide technical guidance and mentorship to junior team members on Databricks best practices and data engineering principles
  • Participate in on-call rotation for critical production systems with focus on Databricks platform stability
  • Lead operational reviews and contribute to continuous improvement initiatives for Databricks platform reliability and efficiency
  • Coordinate with infrastructure teams on Databricks cluster provisioning, network configurations, and security implementations

Requirements

  • Degree in Computer Science or Computer Engineering
  • Minimum 8-10 years of working experience in system operations, compliance, and management
  • Hands-on project experience specifically with the Databricks platform (primary requirement)
  • Project experience in cloud operations or cloud architecture
  • Must be cloud certified (AWS)
  • Databricks certification (Associate or Professional level) - highly preferred
  • Exposure to hospital information/clinical systems is an added advantage
  • Understanding of DevOps practices and CI/CD pipelines for Databricks-based data engineering projects
  • Knowledge of ITIL frameworks and operational best practices
  • Expert-level proficiency in Databricks platform, including workspace management, cluster configuration, and job orchestration
  • Strong expertise in Apache Spark within Databricks environment, including Spark SQL, DataFrames, and RDDs
  • Extensive experience with Delta Lake, including data versioning, time travel, and ACID transactions
  • Proficiency in Databricks Unity Catalog for data governance and metadata management
  • In-depth understanding of data warehouse concepts, data profiling, data verification, and advanced analytics techniques
  • Strong knowledge of monitoring, incident management, and cloud cost control
  • Databricks (primary and most critical skill)
  • AWS cloud services and architecture
  • IDMC (Informatica Intelligent Data Management Cloud)
  • Tableau for data visualization
  • Oracle Database management
  • ML Ops practices within Databricks environment (Good to have)
  • Stata for statistical analysis (Good to have)
  • Amazon SageMaker integration with Databricks (Good to have)
  • DataRobot platform integration (Good to have)

Licence no: 12C6060
