Senior Data Engineer (AWS & Confluent Data/AI Projects) | Remote
Work Set-up: Remote
Schedule: 10am-6pm SGT
Responsibilities:
- Architect and Design Data Solutions: Lead the design and architecture of scalable, secure, and efficient data pipelines for both batch and real-time data processing on AWS. This includes data ingestion, transformation, storage, and consumption layers.
- Confluent Kafka Expertise: Design, implement, and optimize highly performant and reliable data streaming solutions using Confluent Platform (Kafka, ksqlDB, Kafka Connect, Schema Registry). Ensure efficient data flow for real-time analytics and AI applications.
- AWS Cloud-Native Development: Develop and deploy data solutions using a wide range of AWS services, including but not limited to:
  - Data Storage: S3 (Data Lake), RDS, DynamoDB, Redshift, Lake Formation.
  - Data Processing: Glue, EMR (Spark), Lambda, Kinesis, MSK (for Kafka integration).
  - Orchestration: AWS Step Functions, Airflow (on EC2 or MWAA).
  - Analytics & ML: Athena, QuickSight, SageMaker (for MLOps integration).
Required Skills and Qualifications:
- Bachelor's or Master's degree in Computer Science, Software Engineering, or a related quantitative field.
- 3 to 5 years of experience in data engineering, with a significant focus on cloud-based solutions.
- Extensive hands-on experience with Confluent Platform/Apache Kafka for building real-time data streaming applications.
- Proficiency in at least one programming language such as Python (including PySpark), Scala, or Java.
- Expertise in SQL and experience with various database systems (relational and NoSQL).
- Solid understanding of data warehousing, data lakes, and data modeling concepts (star schema, snowflake schema, etc.).
- Experience with CI/CD pipelines and DevOps practices (Git, Terraform, Jenkins, Azure DevOps, or similar).
Preferred Qualifications (Nice to Have):
- AWS Certifications (e.g., AWS Certified Data Analytics - Specialty, AWS Certified Solutions Architect - Associate/Professional).
- Experience with other streaming technologies (e.g., Flink).
- Knowledge of containerization technologies (Docker, Kubernetes).
- Familiarity with Data Mesh or Data Fabric concepts.
- Experience with data visualization tools (e.g., Tableau, Power BI, QuickSight).
- Understanding of MLOps principles and tools.