A leading business consulting firm is seeking a Data Engineer to design and maintain data pipelines and collaborate with machine learning teams on Google Cloud Platform. The ideal candidate will possess experience in GCP and proficiency in Python and PySpark, ensuring data quality through effective monitoring processes. This entry-level role offers an excellent opportunity to grow skills in a dynamic environment.
Job Description for Data Engineer at Tshinwelo Innovative Business Solutions (TIBS)

We are seeking a skilled Data Engineer to join our team.
Responsibilities include:

- Design, develop, and maintain scalable data pipelines and ETL processes on Google Cloud Platform (GCP).
- Implement and optimize data storage solutions using BigQuery, Cloud Storage, and other GCP services.
- Collaborate with data scientists, machine learning engineers, data engineers, and other stakeholders to deploy machine learning models into production.
- Develop and maintain custom deployment solutions for machine learning models using tools such as Kubeflow, AI Platform, and Docker.
- Write clean, efficient, and maintainable code in Python and PySpark for data processing and transformation tasks.
- Ensure data quality, integrity, and consistency through validation and monitoring processes, with a deep understanding of Medallion architecture.
- Develop metadata-driven pipelines for optimal data processing.
- Use Terraform to manage and provision GCP cloud infrastructure resources.
- Troubleshoot and resolve production issues related to data pipelines and machine learning models.
- Stay current with industry trends and best practices in data engineering, machine learning, and cloud technologies, including data lifecycle management, data pruning, model drift, and optimization.

Qualifications

Must-have skills:

- Machine learning (general experience)
- Data visualization
- Google Cloud Platform
- PySpark

Educational requirements include a Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
Proven experience as a Data Engineer focusing on GCP is essential.
Proficiency in Python and PySpark; experience with GCP services (BigQuery, Cloud Storage, Dataflow, AI Platform), Terraform, Docker, and Kubernetes; and strong problem-solving skills are required.
Excellent communication and collaboration skills are also necessary.

Additional Information

Preferred qualifications include experience with custom deployment solutions, MLOps, knowledge of AWS or Azure, CI/CD pipelines, and certifications in GCP Data Engineering.
Visualization experience is a plus but not mandatory.

Please forward your application to: