Google Cloud Platform (GCP) Data Engineer

Tshinwelo Innovative Business Solutions (TIBS)

Johannesburg

On-site

ZAR 400,000 - 600,000

Full time

16 days ago

Job summary

A leading business consulting firm is seeking a Data Engineer to design and maintain data pipelines and collaborate with machine learning teams on Google Cloud Platform. The ideal candidate will possess experience in GCP and proficiency in Python and PySpark, ensuring data quality through effective monitoring processes. This entry-level role offers an excellent opportunity to grow skills in a dynamic environment.

Qualifications

  • Proven experience as a Data Engineer focusing on GCP.
  • Proficiency in Python and PySpark.
  • Strong problem-solving skills and excellent communication abilities.

Responsibilities

  • Design, develop, and maintain scalable data pipelines and ETL processes on GCP.
  • Collaborate with data scientists and machine learning engineers.
  • Ensure data quality, integrity, and consistency.

Skills

Machine Learning
Data visualization
Google Cloud Platform
PySpark

Education

Bachelor's or Master's degree in Computer Science, Engineering, or related field

Tools

BigQuery
Cloud Storage
Dataflow
Terraform
Docker
Kubernetes

Job description

Job Description for Data Engineer at Tshinwelo Innovative Business Solutions (TIBS)

We are seeking a skilled Data Engineer to join our team.

The responsibilities include:

  • Design, develop, and maintain scalable data pipelines and ETL processes on Google Cloud Platform (GCP).
  • Implement and optimize data storage solutions using BigQuery, Cloud Storage, and other GCP services.
  • Collaborate with data scientists, machine learning engineers, data engineers, and other stakeholders to deploy machine learning models into production.
  • Develop and maintain custom deployment solutions for machine learning models using tools such as Kubeflow, AI Platform, and Docker.
  • Write clean, efficient, and maintainable code in Python and PySpark for data processing and transformation tasks.
  • Ensure data quality, integrity, and consistency through validation and monitoring processes, with a deep understanding of Medallion architecture.
  • Develop metadata-driven pipelines for optimal data processing.
  • Use Terraform to manage and provision GCP cloud infrastructure resources.
  • Troubleshoot and resolve production issues related to data pipelines and machine learning models.
  • Stay current with industry trends and best practices in data engineering, machine learning, and cloud technologies, including data lifecycle management, data pruning, model drift, and optimization.

Qualifications

Must-have skills:

  • Machine Learning (general experience)
  • Data visualization
  • Google Cloud Platform
  • PySpark

Educational requirements include a Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

Proven experience as a Data Engineer focusing on GCP is essential.

Proficiency in Python and PySpark, experience with GCP services (BigQuery, Cloud Storage, Dataflow, AI Platform), Terraform, Docker, Kubernetes, and strong problem-solving skills are required.

Excellent communication and collaboration skills are also necessary.

Additional Information

Preferred qualifications include experience with custom deployment solutions, MLOps, knowledge of AWS or Azure, CI/CD pipelines, and certifications in GCP Data Engineering.

Visualization experience is a plus but not mandatory.

Please forward your email to:

Details

  • Seniority level: Entry level
  • Employment type: Contract
  • Job function: Information Technology
  • Industry: Business Consulting and Services