Cloud Data Architect

Virtusa

Toronto

On-site

CAD 100,000 - 130,000

Full time

8 days ago

Job summary

A leading technology services firm is seeking a Cloud Data Architect to define and own enterprise data architecture patterns, including lakehouse, data lake, and data governance. The successful candidate will create reference architectures for data pipelines using Databricks and AWS services, with a focus on performance optimization and reliability standards. The role requires expertise in data governance and security, as well as orchestration tools such as Airflow. This is an excellent opportunity to lead transformative data initiatives in a collaborative environment.

Qualifications

  • Expertise in designing enterprise data architecture including lakehouses and data lakes.
  • Experience with governance standards and data quality metrics.
  • Proficiency in orchestration tools like Airflow for data pipeline management.

Responsibilities

  • Define and own enterprise data architecture patterns aligned with business needs.
  • Create reference architectures for data pipelines using Databricks and AWS services.
  • Implement data governance and reliability standards for data quality and security.

Skills

Data architecture patterns
Enterprise data governance
Performance optimization
Event-driven ingestion
Data orchestration

Tools

Databricks
Airflow
Kafka
AWS Glue
Google BigQuery
Job Description - Cloud Data Architect (CREQ239726)

Define and own enterprise data architecture patterns: lakehouse, data warehouse, data lake, data vault and dimensional models, aligned to business needs and regulatory requirements.

Create reference architectures for batch, streaming, and transactional pipelines using Databricks (DLT, Autoloader, Unity Catalog, SQL Warehouse) alongside native AWS and GCP services, including those below (a minimal ingestion sketch follows the list):

  • AWS: S3, Glue, Lambda, MSK/Kinesis, EMR, Redshift, Step Functions, MWAA
  • GCP: Dataflow, Dataproc, BigQuery, Composer, GCS, Cloud Functions, Vertex AI
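
For illustration, a minimal sketch of the Databricks side of such a reference architecture, using Auto Loader for incremental file ingestion. This is a sketch under assumptions: the bucket, schema location, and table names are hypothetical placeholders, and `spark` is the session a Databricks cluster provides.

    # Auto Loader ingestion sketch (Databricks). All paths/names are
    # hypothetical; `spark` is assumed to come from the Databricks runtime.
    stream = (
        spark.readStream.format("cloudFiles")      # Auto Loader incremental file source
        .option("cloudFiles.format", "json")       # format of landing files
        .option("cloudFiles.schemaLocation", "s3://example-bucket/_schemas/raw_events")
        .load("s3://example-bucket/landing/raw_events/")
    )

    (
        stream.writeStream
        .option("checkpointLocation", "s3://example-bucket/_checkpoints/raw_events")
        .trigger(availableNow=True)                # incremental, batch-style run
        .toTable("main.bronze.raw_events")         # Unity Catalog three-level name
    )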

Establish performance optimization guidelines for PySpark/Spark: memory tuning, shuffle/partition strategies, UDF optimization, RAPIDS/GPU acceleration.
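
A hedged sketch of what such guidelines can look like in code; the configuration values below are illustrative starting points, not tuned recommendations.

    from pyspark.sql import SparkSession, functions as F

    spark = (
        SparkSession.builder.appName("tuning-sketch")
        .config("spark.sql.shuffle.partitions", "400")   # size to data volume
        .config("spark.sql.adaptive.enabled", "true")    # AQE coalesces small shuffle partitions
        .config("spark.sql.autoBroadcastJoinThreshold", str(64 * 1024 * 1024))
        .getOrCreate()
    )

    df = spark.range(10_000_000).withColumnRenamed("id", "user_id")

    # UDF optimization: prefer built-in expressions over Python UDFs, which
    # serialize every row through the Python worker process.
    df = df.withColumn("bucket", F.col("user_id") % 32)

    # Partition strategy: repartition on the key of an upcoming wide
    # operation to control shuffle skew.
    df = df.repartition(64, "bucket")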

Design event‑driven ingestion and CDC architectures (GoldenGate, Kafka/MSK, Kinesis, Glue/Airflow operators).
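
A minimal sketch of the Kafka/MSK consumption side of such a CDC flow, assuming Spark Structured Streaming with the Kafka connector available and a `spark` session in scope; the broker address, topic name, and change-event schema are hypothetical.

    from pyspark.sql import functions as F, types as T

    # Hypothetical shape of an upstream change event (e.g. emitted by a CDC tool).
    change_schema = T.StructType([
        T.StructField("op", T.StringType()),     # c/u/d: insert, update, delete
        T.StructField("id", T.LongType()),
        T.StructField("payload", T.StringType()),
    ])

    cdc = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker1.example.internal:9092")
        .option("subscribe", "orders.cdc")
        .option("startingOffsets", "earliest")
        .load()
        # Kafka delivers bytes; parse the value column into typed fields.
        .select(F.from_json(F.col("value").cast("string"), change_schema).alias("e"))
        .select("e.*")
    )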

Governance, Security & Reliability

Implement data governance: Unity Catalog, access controls, lineage, PII handling, and encryption in transit and at rest.
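
As a sketch, Unity Catalog access controls of this kind are typically expressed as SQL grants; the catalog, schema, view, and group names below are hypothetical placeholders.

    # Issued via spark.sql on a Unity Catalog-enabled Databricks workspace.
    spark.sql("GRANT USE CATALOG ON CATALOG main TO `data-engineers`")
    spark.sql("GRANT SELECT ON SCHEMA main.silver TO `analysts`")

    # PII handling: expose a pseudonymized view instead of the raw table.
    spark.sql("""
        CREATE OR REPLACE VIEW main.silver.customers_masked AS
        SELECT customer_id,
               sha2(email, 256) AS email_hash   -- hashed in place of raw PII
        FROM main.silver.customers
    """)
    spark.sql("GRANT SELECT ON VIEW main.silver.customers_masked TO `analysts`")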

Define observability & reliability standards: data quality (DQ), schema evolution, incident management, SLAs/SLOs, cost guardrails.
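
One way such data quality rules are declared on Databricks is with Delta Live Tables expectations; a minimal sketch, assuming it runs inside a DLT pipeline and that a `bronze_orders` source table exists (both names hypothetical).

    import dlt
    from pyspark.sql import functions as F

    @dlt.table(name="silver_orders")
    @dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")  # DQ: drop violating rows
    @dlt.expect("recent_event", "event_ts >= '2020-01-01'")        # DQ: record, keep row
    def silver_orders():
        # Incrementally read the hypothetical bronze table and stamp ingestion time.
        return dlt.read_stream("bronze_orders").withColumn(
            "ingested_at", F.current_timestamp()
        )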

DevOps & Orchestration

Standardize Airflow/MWAA/Composer orchestration with reusable operators and DAG patterns.
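
A minimal sketch of one such reusable DAG pattern, assuming Airflow 2.x (2.4+ for the `schedule` argument); the dataset names and the ingest callable are hypothetical placeholders.

    from datetime import datetime, timedelta
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    DEFAULT_ARGS = {"retries": 2, "retry_delay": timedelta(minutes=5)}

    def ingest(dataset: str, **_):
        print(f"ingesting {dataset}")  # stand-in for the real load step

    def make_ingestion_dag(dataset: str) -> DAG:
        """Factory that stamps out per-dataset DAGs with shared defaults."""
        dag = DAG(
            dag_id=f"ingest_{dataset}",
            start_date=datetime(2024, 1, 1),
            schedule="@daily",
            default_args=DEFAULT_ARGS,
            catchup=False,
        )
        with dag:
            PythonOperator(
                task_id="ingest",
                python_callable=ingest,
                op_kwargs={"dataset": dataset},
            )
        return dag

    # Module-level assignment so the scheduler discovers each generated DAG.
    for name in ("orders", "customers"):
        globals()[f"ingest_{name}"] = make_ingestion_dag(name)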
