As a Data and Analytics Engineer, you will be responsible for integrating data sources and for building, managing, and orchestrating data pipelines in cloud-based data platforms (e.g., AWS, Azure, GCP).
Responsibilities:
- Work closely with cross-functional teams to understand business requirements and translate them into scalable and efficient cloud solutions.
- Maintain, update, and refine technical specifications and design documentation.
- Build scalable, robust, and secure cloud-native applications and services using modern cloud platforms, such as AWS, Azure, or Google Cloud Platform.
- Collaborate with stakeholders to gather and analyze business requirements and translate them into technical specifications and design documents.
- Write clean, efficient, and maintainable code in languages such as Python (PySpark), following best practices and design patterns.
- Utilize AWS- and Azure-specific services and technologies, such as AWS Glue, Azure Data Factory, Databricks, Azure Data Lake Storage, and serverless compute services like Azure Functions, Azure Container Apps, and Azure Kubernetes Service, to build scalable and flexible applications.
- Integrate with various data storage solutions, including databases, data warehouses, and object storage, to manage and process large datasets.
- Implement security controls and best practices to ensure the confidentiality and integrity of data in the cloud.
- Perform thorough functional and unit testing and debugging of applications and services to ensure they meet quality standards and functional requirements.
- Stay up to date with emerging cloud technologies, trends, and best practices, and recommend innovative solutions to enhance the development process.
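The extract-transform-load flow behind the pipeline responsibilities above can be sketched framework-agnostically. In practice the same shape would be expressed in PySpark, AWS Glue, or Azure Data Factory; the following is only a minimal stdlib illustration, with all data and names hypothetical:

```python
import csv
import io
import sqlite3

# Hypothetical raw feed, standing in for a source system export.
RAW = """order_id,region,amount
1,emea,120.50
2,apac,
3,emea,75.00
"""

def extract(text):
    # Parse the raw CSV into a list of dict records.
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    # Drop records with a missing amount and cast fields to proper types.
    return [
        {"order_id": int(r["order_id"]), "region": r["region"], "amount": float(r["amount"])}
        for r in rows
        if r["amount"]
    ]

def load(rows, conn):
    # Land the cleaned records in a target table (here, in-memory SQLite).
    conn.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (:order_id, :region, :amount)", rows)

conn = sqlite3.connect(":memory:")
clean = transform(extract(RAW))
load(clean, conn)
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]  # 195.5
```

In a distributed setting the same three stages map onto Spark reads, DataFrame transformations, and writes to a data lake or warehouse; the orchestration layer (e.g., Azure Data Factory) schedules and monitors them.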
Requirements:
- Significant knowledge of cloud-based data platforms (e.g., AWS, Azure, GCP) and their native services (e.g., Azure Data Lake, data integration pipelines such as Azure Data Factory, Databricks, SQL/NoSQL databases, and serverless compute services like Azure Functions, Azure Container Apps, and Azure Kubernetes Service), with a focus on building analytics initiatives for the enterprise.
- Hands-on experience implementing ETL/ELT pipelines using an open-source distributed processing framework (e.g., Apache Spark) or cloud-based data platforms.
- Strong programming skills, knowledge of cloud platforms, and the ability to solve service integration challenges and implement robust cloud-based systems.
- 2+ years of experience in data engineering in the Cloud.
- In-depth knowledge of and hands-on experience with data structures, algorithms, and object-oriented programming.
- Bachelor's degree in computer science, software engineering, or a related field (or equivalent work experience).
- Proven experience working as a data engineer/cloud developer or in a similar role, with a focus on developing cloud-native applications.
- Strong proficiency in Python (including PySpark) and SQL, plus at least one additional programming language, such as Java or C#.
- Hands-on experience with cloud platform services and tools, such as AWS, Azure, or Google Cloud Platform.
- Experience with cloud development frameworks, such as AWS SDK, Azure SDK, or Google Cloud Client Libraries.
- Understanding of databases, data lakes, data warehouses, and data storage technologies, such as SQL, NoSQL, and object storage.
- Experience developing applications on serverless compute services such as Azure Functions, Azure Container Apps, and Azure Kubernetes Service.
- Strong problem-solving and analytical skills, with the ability to debug and troubleshoot complex issues in a distributed cloud environment.
- Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams.
- Relevant certifications, such as AWS Certified Developer - Associate or Microsoft Certified: Azure Developer Associate, are highly desirable.
- Demonstrated experience working in an Agile environment.