Summary
This role is ideal for someone with strong expertise in Snowflake and/or Databricks, advanced Python and SQL skills, and experience building scalable data pipelines and high-performance, low-latency APIs. You will help modernize our data infrastructure and drive innovation across analytics, engineering, and AI/ML workloads.
Key Responsibilities
- Design, build, and maintain robust and scalable data pipelines using Python, SQL, and ETL tools like Informatica, ADF, SSIS, or Talend.
- Architect and operationalize cloud-based data platforms leveraging Snowflake and/or Databricks for storage, transformation, and analytics.
- Build high-performance, low-latency APIs to enable real-time data access and power scalable applications.
- Collaborate with cross-functional teams to gather business requirements and translate them into efficient, production-grade data engineering solutions.
- Design and implement data models, including dimensional modeling, to support reporting and advanced analytics use cases.
- Optimize SQL performance and automate data workflows through CI/CD pipelines and orchestration frameworks.
- Integrate structured, semi-structured, and unstructured data from diverse sources including APIs and large transactional systems.
- Ensure data reliability, quality, and governance using tools like Alation or Talend DQ.
- Support data science teams with engineered datasets to power AI/ML/LLM models and business insights.
Required Qualifications
- Bachelor’s or Master’s degree in Computer Science, Information Systems, Engineering, or a related field.
- 3+ years of experience in data engineering, working with large-scale data systems and complex transformation logic.
- Strong programming skills in Python and deep expertise in SQL.
- Proven experience with Snowflake and/or Databricks, including performance tuning, workload optimization, and cloud-native architecture.
- Hands-on experience with Informatica or other modern ETL tools (ADF, Talend, SSIS).
- Experience building and managing RESTful APIs to support real-time applications.
- Solid understanding of data modeling, data warehousing, and data architecture best practices.
- Familiarity with data governance, quality validation, and metadata cataloging tools.
- Comfortable working in agile teams and driving deliverables independently.
Preferred Qualifications
- Experience with Snowflake Cortex for AI/ML and LLM integration.
- Familiarity with Spark, Delta Lake, or similar big data frameworks.
- Exposure to AI/LLM model pipelines, prompt engineering, or chatbot data ingestion.
- Certifications in Snowflake, Databricks, or cloud platforms (Azure, AWS, GCP).
- Experience in the semiconductor or high-tech industries.