Google Cloud Platform (GCP) Data Engineer

Tshinwelo Innovative Business Solutions (TIBS)

Johannesburg

On-site

ZAR 400,000 - 600,000

Full time

16 days ago

Job summary

A leading business consulting firm is seeking a Data Engineer to design and maintain data pipelines and collaborate with machine learning teams on Google Cloud Platform. The ideal candidate will possess experience in GCP and proficiency in Python and PySpark, ensuring data quality through effective monitoring processes. This entry-level role offers an excellent opportunity to grow skills in a dynamic environment.

Qualifications

  • Proven experience as a Data Engineer focusing on GCP.
  • Proficiency in Python and PySpark.
  • Strong problem-solving skills and excellent communication abilities.

Responsibilities

  • Design, develop, and maintain scalable data pipelines and ETL processes on GCP.
  • Collaborate with data scientists and machine learning engineers.
  • Ensure data quality, integrity, and consistency.

Skills

Machine Learning
Data visualization
Google Cloud Platform
PySpark

Education

Bachelor's or Master's degree in Computer Science, Engineering, or related field

Tools

BigQuery
Cloud Storage
Dataflow
Terraform
Docker
Kubernetes

Job description

Job Description for Data Engineer at Tshinwelo Innovative Business Solutions (TIBS)

We are seeking a skilled Data Engineer to join our team.

The responsibilities include:

  • Design, develop, and maintain scalable data pipelines and ETL processes on Google Cloud Platform (GCP).
  • Implement and optimize data storage solutions using BigQuery, Cloud Storage, and other GCP services.
  • Collaborate with data scientists, machine learning engineers, data engineers, and other stakeholders to deploy machine learning models into production.
  • Develop and maintain custom deployment solutions for machine learning models using tools such as Kubeflow, AI Platform, and Docker.
  • Write clean, efficient, and maintainable code in Python and PySpark for data processing and transformation tasks.
  • Ensure data quality, integrity, and consistency through validation and monitoring processes, with a deep understanding of Medallion architecture.
  • Develop metadata-driven pipelines for optimal data processing.
  • Use Terraform to manage and provision GCP cloud infrastructure resources.
  • Troubleshoot and resolve production issues related to data pipelines and machine learning models.
  • Stay current with industry trends and best practices in data engineering, machine learning, and cloud technologies, including data lifecycle management, data pruning, model drift, and optimization.

Qualifications

Must-have skills:

  • Machine Learning (general experience)
  • Data visualization
  • Google Cloud Platform
  • PySpark

Educational requirements include a Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

Proven experience as a Data Engineer focusing on GCP is essential.

Proficiency in Python and PySpark, experience with GCP services (BigQuery, Cloud Storage, Dataflow, AI Platform), Terraform, Docker, Kubernetes, and strong problem-solving skills are required.

Excellent communication and collaboration skills are also necessary.

Additional Information

Preferred qualifications include experience with custom deployment solutions, MLOps, knowledge of AWS or Azure, CI/CD pipelines, and certifications in GCP Data Engineering.

Visualization experience is a plus but not mandatory.

Please forward your email to:

Details

  • Seniority level: Entry level
  • Employment type: Contract
  • Job function: Information Technology
  • Industry: Business Consulting and Services