Role Overview
We are seeking an experienced Azure Data Architect to design and implement scalable, secure, and high-performing data solutions on Microsoft Azure. The ideal candidate will have strong expertise in Azure Databricks, PySpark, and modern data architecture principles to enable advanced analytics and business intelligence.
Key Responsibilities
- Architect End-to-End Data Solutions
  - Design and implement data ingestion, transformation, and storage pipelines using Azure Data Factory (ADF), Azure Databricks, and PySpark.
  - Develop lakehouse architectures using Delta Lake and the medallion pattern (Bronze/Silver/Gold layers).
- Data Modeling & Governance
  - Create conceptual, logical, and physical data models for structured and semi-structured data.
  - Implement data governance using Microsoft Purview (formerly Azure Purview) and Unity Catalog for metadata management and access control.
- Performance & Optimization
  - Optimize Databricks clusters for cost and performance; implement autoscaling and caching strategies.
  - Apply PySpark best practices for distributed data processing and transformation.
- Integration & Analytics
  - Integrate with Azure Synapse Analytics, Power BI, and other BI tools for reporting and visualization.
  - Enable real-time and batch data processing using Event Hubs, Stream Analytics, and Spark Structured Streaming.
- Security & Compliance
  - Ensure adherence to data security, privacy, and regulatory compliance standards.
  - Implement role-based access control (RBAC), VNet injection, and encryption for data at rest and in transit.
- Collaboration & Leadership
  - Work closely with business stakeholders, data engineers, and data scientists to translate requirements into technical solutions.
  - Mentor team members on PySpark, Databricks, and Azure best practices.
Required Skills & Qualifications
- Experience:
  - 10+ years in data architecture and engineering; 3+ years building data solutions on Azure.
- Technical Expertise:
  - Azure Services: Data Factory, Databricks, Synapse Analytics, Data Lake Storage Gen2, Cosmos DB.
  - Big Data Tools: PySpark, Delta Lake, Spark SQL.
  - Programming: Python (with pandas and NumPy) and SQL; familiarity with Scala is a plus.
- Data Architecture:
  - Strong knowledge of dimensional modeling, star/snowflake schemas, and NoSQL data modeling.
- Other Skills:
  - CI/CD with Azure DevOps; Terraform for infrastructure as code.
  - Excellent communication and stakeholder management skills.