Python Developer – Data Engineering & Migration (5–6 Month Contract)
CLIENT SUMMARY
Our client is a global advanced analytics and AI consulting firm that helps Fortune 1000 enterprises make smarter, data-driven decisions. With operations in the U.S., India, and other international markets, they combine deep industry knowledge with state-of-the-art AI/ML capabilities to drive business transformation at scale.
This is a contract role for 5–6 months, focused on delivering production-grade Python solutions for enterprise data migration projects.
ROLE SUMMARY
We are seeking an experienced Python Developer with strong expertise in data engineering, cloud migration, and production-grade software practices. This role will involve building robust, maintainable, and scalable data pipelines, with hands-on contributions to critical migration and transformation projects.
RESPONSIBILITIES
- Design and implement Python-based data pipelines for large-scale relational database migrations to cloud-native data platforms
- Write clean, modular, and efficient production-grade Python code using best practices (Separation of Concerns, reusability, readability, testing)
- Perform unit testing, reconciliation, and data quality checks within the migration pipeline (see the reconciliation sketch after this list)
- Optimize data workflows and SQL queries for performance and scalability
- Collaborate with cross-functional teams to understand requirements and translate them into scalable technical solutions
- Support CI/CD workflows and infrastructure automation for deployment and testing
- Document solutions clearly for handover, audit, and maintenance purposes
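For illustration, here is a minimal sketch of the kind of row-count reconciliation step mentioned above, assuming SQLAlchemy 2.x; the `reconcile_row_counts` helper, engine setup, and table names are hypothetical, not part of the client's stack description:

```python
from dataclasses import dataclass

import sqlalchemy as sa


@dataclass
class ReconciliationResult:
    table: str
    source_rows: int
    target_rows: int

    @property
    def matched(self) -> bool:
        return self.source_rows == self.target_rows


def reconcile_row_counts(source: sa.Engine, target: sa.Engine, table: str) -> ReconciliationResult:
    """Compare row counts for one table between the source and the migrated target."""
    # Table names are assumed to come from a trusted migration config, not user input.
    query = sa.text(f"SELECT COUNT(*) FROM {table}")
    with source.connect() as conn:
        source_rows = conn.execute(query).scalar_one()
    with target.connect() as conn:
        target_rows = conn.execute(query).scalar_one()
    return ReconciliationResult(table=table, source_rows=source_rows, target_rows=target_rows)
```

A check like this would typically run per migrated table after each load, with mismatches logged and surfaced before cutover.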
REQUIREMENTS
- 5–10 years of overall software development or data engineering experience
- Proven track record of writing scalable, modular, and testable Python code in production environments
- Experience with data warehouse implementations, data migrations, and cloud-first architectures
- Solid understanding of ETL frameworks, data reconciliation, and unit testing
- Strong SQL skills including performance tuning, joins, and transformations
- Hands-on experience with PySpark and handling large datasets efficiently
- Familiarity with containerization, CI/CD pipelines, and automated testing workflows
- Excellent problem-solving skills and ability to communicate technical solutions clearly
CORE TOOLS AND TECHNOLOGIES
Python Development (Production-Grade)
- Python 3.x
- Pandas, PyArrow, NumPy
- PySpark (for distributed data processing)
- SQLAlchemy (optional ORM usage)
- Pydantic or similar for data validation
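To give a sense of how these libraries typically combine in a pipeline step, here is a minimal sketch of row-level validation with Pandas and Pydantic (v2 assumed); the `CustomerRecord` fields and the `validate_frame` helper are purely illustrative:

```python
import pandas as pd
from pydantic import BaseModel, Field, ValidationError


class CustomerRecord(BaseModel):
    """Illustrative schema for one migrated customer row."""
    customer_id: int
    email: str
    lifetime_value: float = Field(ge=0)


def validate_frame(df: pd.DataFrame) -> tuple[list[CustomerRecord], list[dict]]:
    """Split a DataFrame into validated records and rejected raw rows."""
    valid: list[CustomerRecord] = []
    rejected: list[dict] = []
    for row in df.to_dict(orient="records"):
        try:
            valid.append(CustomerRecord(**row))
        except ValidationError:
            rejected.append(row)
    return valid, rejected
```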
Testing & Quality
- pytest – Unit and integration testing
- flake8, black, isort – Code linting, formatting, style enforcement
- mypy – Static type checking
- coverage.py – Code coverage tracking
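As a flavour of the testing style expected, a short pytest example against the hypothetical `validate_frame` helper sketched above (the module path is assumed):

```python
import pandas as pd
import pytest

from pipelines.validation import validate_frame  # hypothetical module path


def test_negative_lifetime_value_is_rejected():
    df = pd.DataFrame([
        {"customer_id": 1, "email": "a@example.com", "lifetime_value": 10.0},
        {"customer_id": 2, "email": "b@example.com", "lifetime_value": -5.0},
    ])
    valid, rejected = validate_frame(df)
    assert len(valid) == 1
    assert rejected[0]["customer_id"] == 2


@pytest.mark.parametrize("value", [0.0, 1.5, 1_000_000.0])
def test_non_negative_values_are_accepted(value):
    df = pd.DataFrame([{"customer_id": 3, "email": "c@example.com", "lifetime_value": value}])
    valid, rejected = validate_frame(df)
    assert len(valid) == 1 and not rejected
```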
Data Engineering / ETL
- Strong SQL (query optimization, data modeling)
- Cloud-native platforms: AWS (S3, RDS, Glue, Lambda), Databricks, or Snowflake
- ETL scripting and pipeline development using Python
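By way of illustration, a minimal PySpark sketch of the extract-transform-load pattern such pipelines follow; the JDBC connection details, column names, and S3 path are placeholders:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_migration").getOrCreate()

# Extract: read the source table over JDBC (connection details are placeholders).
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://source-host:5432/sales")
    .option("dbtable", "public.orders")
    .option("user", "etl_user")
    .option("password", "***")
    .load()
)

# Transform: light cleanup and a derived partition column.
cleaned = (
    orders.dropDuplicates(["order_id"])
    .withColumn("order_date", F.to_date("order_ts"))
    .filter(F.col("amount") > 0)
)

# Load: write partitioned Parquet to the cloud data lake (path is a placeholder).
cleaned.write.mode("overwrite").partitionBy("order_date").parquet("s3a://data-lake/curated/orders/")
```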
DevOps & Deployment
- Docker – Containerization of Python applications
- Terraform or AWS CloudFormation – Infrastructure as code
- CI/CD – GitHub Actions, GitLab CI, or similar
- Logging and monitoring tools (e.g., Loguru, Sentry, Prometheus)
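On the logging side, a small sketch of structured logging with Loguru (sink path, retention policy, and messages are illustrative):

```python
from loguru import logger

# Add a rotating file sink alongside the default stderr sink (path is illustrative).
logger.add("logs/migration_{time}.log", rotation="100 MB", retention="14 days", level="INFO")


def log_load(table: str, row_count: int) -> None:
    """Record a completed load step with structured fields."""
    logger.info("Loaded {rows} rows into {table}", rows=row_count, table=table)


log_load("public.orders", 1_250_000)
```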
GOOD TO HAVE
- Experience with cloud data platforms like AWS Glue, Databricks, or Snowflake
- Familiarity with Master Data Management (MDM) practices
- Exposure to REST API integrations and working in a microservices environment
- Experience working in Agile teams with tools like Jira, Confluence, Git