Senior Data Engineer (PySpark)
Location: Pune / PAN India
Job Type: Full-time
Experience: 5+ years
Job Summary
We are looking for an experienced Senior Data Engineer with strong expertise in PySpark and large-scale data processing to join our data engineering team. In this role, you will design, build, and maintain data pipelines that support enterprise-level analytics and data science initiatives.
Key Responsibilities
- Design, implement, and maintain scalable and efficient ETL/ELT pipelines using PySpark.
- Work with big data frameworks such as Apache Spark, Hadoop, Hive, and Delta Lake.
- Handle large volumes of structured and unstructured data from a variety of sources (databases, APIs, flat files, etc.).
- Develop and optimize complex data transformation workflows and batch/streaming jobs.
- Ensure data quality, integrity, and governance throughout the data lifecycle.
- Collaborate closely with data scientists, analysts, DevOps, and business stakeholders.
- Troubleshoot performance issues and optimize PySpark jobs for distributed environments.
- Manage workflows using orchestration tools such as Apache Airflow or similar.
- Contribute to the architecture and design of scalable, fault-tolerant data platforms.
Required Skills & Qualifications
- 5+ years of experience in data engineering roles.
- Strong hands-on experience with PySpark for data transformation and processing.
- Proficient in SQL and Python.
- Deep understanding of distributed computing concepts, data partitioning, and performance tuning.
- Experience with cloud platforms such as AWS, Azure, or Google Cloud Platform.
- Experience with data lakehouse architectures (e.g., Delta Lake, Databricks).
- Knowledge of version control (Git), CI/CD, and agile methodologies.
Nice to Have
- Exposure to streaming data technologies like Kafka, Spark Streaming, or Flink.
- Experience with Snowflake, Redshift, or BigQuery.
- Knowledge of data governance, data lineage, and cataloging tools (e.g., Collibra, Alation).
- Familiarity with containerization (Docker, Kubernetes).
To apply, send your resume to avinash.allure@bhasaka.com