Job Title: Data Engineer – GCP & Python Development
Location: Houston, TX
Key Responsibilities
- Design, develop, and maintain reliable and scalable data pipelines on GCP using tools like Dataflow, BigQuery, Pub/Sub, and Cloud Composer.
- Write efficient and reusable Python scripts and modules for ETL/ELT workflows and data transformations.
- Collaborate with data scientists, analysts, and other engineers to integrate data from various sources, ensure data quality, and optimize performance.
- Build and manage data lake and data warehouse solutions leveraging BigQuery and Cloud Storage.
- Automate data validation and monitoring workflows for data integrity and reliability.
- Implement CI/CD pipelines for data engineering workflows using tools like Cloud Build, GitHub Actions, or Jenkins.
- Monitor and optimize job performance, cost efficiency, and error handling across GCP services.
- Maintain clear documentation of data flows, schemas, and transformation logic.
Requirements
Technical Skills
- Strong proficiency in Python, with experience writing scalable, modular, and testable code.
- Solid experience with Google Cloud Platform – especially BigQuery, Cloud Functions, Cloud Storage, Dataflow, Pub/Sub, and Cloud Composer (Airflow).
- Experience with SQL and building optimized queries for large-scale data processing.
- Hands-on experience with data orchestration tools like Apache Airflow (Composer preferred).
- Knowledge of data modeling, data warehousing concepts, and data governance best practices.
- Familiarity with Docker, Terraform, or Kubernetes is a plus.