Job Purpose:
Responsible for creating and managing the technology that underpins the data infrastructure at every step of the data flow. From configuring data sources to integrating analytical tools, these systems are architected, built, and maintained by this general-role data engineer.
Minimum Education (Essential):
Bachelor's degree in Computer Science or Engineering (or similar)
Minimum Education (Desirable):
- Honors degree in Computer Science or Engineering (or similar)
- AWS Certified Data Engineer
- AWS Certified Solutions Architect
- AWS Certified Data Analytics – Specialty
Minimum Applicable Experience (Years):
5+ years of working experience
Required Nature of Experience:
- Data Engineering development
- Experience with AWS services used for data warehousing, computing, and transformations (e.g., AWS Glue, S3, Lambda, Step Functions, Athena, CloudWatch)
- Experience with SQL and NoSQL databases (e.g., PostgreSQL, MySQL, DynamoDB)
- Experience with SQL for querying and transforming data
Skills and Knowledge (Essential):
- Strong skills in Python (especially PySpark for AWS Glue)
- Strong knowledge of data modeling, schema design, and database optimization
- Proficiency with AWS and infrastructure as code
Skills and Knowledge (Desirable):
- Knowledge of SQL, Python, and AWS serverless microservices
- Deploying and managing ML models in production
- Version control (Git), unit testing, and agile methodologies
Data Architecture and Management (20%)
- Design and maintain scalable data architectures using AWS services like S3, Glue, and Athena
- Implement data partitioning and cataloging strategies (see the cataloging example after this list)
- Work with schema evolution and versioning to ensure data consistency
- Develop and manage metadata repositories and data dictionaries
- Support the setup and maintenance of data access roles and privileges
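For illustration only, the partitioning and cataloging work in this area might resemble the boto3 sketch below, which registers a partitioned Parquet table in the Glue Data Catalog so Athena can query it directly from S3. The database, table, column, and bucket names are hypothetical placeholders, not part of this role's actual environment.

```python
import boto3

glue = boto3.client("glue")

# Register a partitioned Parquet table in the Glue Data Catalog so that
# Athena can query the curated data directly from S3.
glue.create_table(
    DatabaseName="curated_db",  # hypothetical catalog database
    TableInput={
        "Name": "orders",  # hypothetical table
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "parquet"},
        "PartitionKeys": [{"Name": "order_date", "Type": "string"}],
        "StorageDescriptor": {
            "Columns": [
                {"Name": "order_id", "Type": "string"},
                {"Name": "customer_id", "Type": "string"},
                {"Name": "amount", "Type": "double"},
            ],
            "Location": "s3://example-curated-bucket/orders/",  # hypothetical bucket
            "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
            },
        },
    },
)
```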
Pipeline Development and ETL (30%)
- Design, develop, and optimize scalable ETL pipelines with AWS Glue and PySpark (see the example job after this list)
- Implement data extraction, transformation, and loading processes
- Optimize ETL jobs for performance and cost efficiency
- Develop and integrate APIs for data workflows
- Integrate data pipelines with ML workflows for scalable deployment
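As a minimal sketch of the ETL work in this area (not a prescribed implementation), an AWS Glue PySpark job might read a source table from the Glue Data Catalog, apply a simple transformation, and write partitioned Parquet back to S3. The database, table, column, and bucket names below are hypothetical.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Resolve the parameters passed by the Glue job configuration
args = getResolvedOptions(sys.argv, ["JOB_NAME"])

sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the source table registered in the Glue Data Catalog
source = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db",      # hypothetical catalog database
    table_name="orders",    # hypothetical source table
)

# Basic transformation: drop rows without a key and project the needed columns
cleaned = (
    source.toDF()
    .dropna(subset=["order_id"])
    .select("order_id", "customer_id", "order_date", "amount")
)

# Write back to S3 as partitioned Parquet for efficient Athena queries
(
    cleaned.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-curated-bucket/orders/")  # hypothetical bucket
)

job.commit()
```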
Automation, Monitoring, and Optimization (30%)
- Automate data workflows, ensuring fault tolerance and efficient resource use
- Implement logging, monitoring, and alerting (see the example alarm after this list)
- Optimize ETL performance and resource usage
- Optimize storage solutions for performance, cost, and scalability
- Deploy ML models into production using Amazon SageMaker
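As one illustrative example of the monitoring and alerting work in this area, the boto3 sketch below creates a CloudWatch alarm on a Lambda function's error metric and notifies an SNS topic. The function name, alarm name, and topic ARN are hypothetical placeholders.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm if the ingestion Lambda reports any errors within a 5-minute window,
# notifying the on-call SNS topic.
cloudwatch.put_metric_alarm(
    AlarmName="orders-ingest-errors",  # hypothetical alarm name
    Namespace="AWS/Lambda",
    MetricName="Errors",
    Dimensions=[{"Name": "FunctionName", "Value": "orders-ingest"}],  # hypothetical function
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    TreatMissingData="notBreaching",
    AlarmActions=["arn:aws:sns:eu-west-1:123456789012:data-oncall"],  # hypothetical topic ARN
)
```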
Security, Compliance, and Best Practices (10%)
- Ensure API security, authentication, and access control
- Implement data encryption and ensure compliance with GDPR, HIPAA, and SOC 2 (see the encryption example after this list)
- Establish data governance policies
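As a brief sketch of the encryption controls in this area, default server-side encryption can be enforced on an S3 bucket with boto3; the bucket name and KMS key alias below are placeholders.

```python
import boto3

s3 = boto3.client("s3")

# Enforce default KMS encryption for every object written to the bucket.
s3.put_bucket_encryption(
    Bucket="example-curated-bucket",  # hypothetical bucket
    ServerSideEncryptionConfiguration={
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": "alias/data-platform-key",  # hypothetical KMS key alias
                },
                "BucketKeyEnabled": True,
            }
        ]
    },
)
```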
Development, Team Mentorship, and Collaboration (5%)
- Work with data scientists, analysts, and business teams to understand data needs
- Collaborate with backend teams for CI/CD integration
- Mentor team members through coaching and code reviews
- Align technology with B2C division strategy
- Identify growth areas within the team
QMS and Compliance (5%)
- Document data processes and architectural decisions
- Maintain high software quality standards and compliance with QMS, security, and data standards
- Ensure compliance with ISO, CE, FDA, and other relevant standards
- Safeguard confidential information and data