Our client is actively seeking a Senior Data Engineer to support the modernization of actuarial pipelines by migrating SAS to Python/PySpark and optimizing performance across AWS and Databricks.
Initial 6-month contract with possibility of extension. Hybrid work model (2 days in office, 3 days remote), open to candidates in Montreal and Toronto.
Responsibilities:
- Review and improve the performance of existing data pipelines.
- Optimize PySpark configurations and database access patterns.
- Migrate and optimize legacy SAS code into Python and modern frameworks (e.g., PySpark, Polars).
- Deploy and debug services in the AWS ecosystem.
- Develop and maintain CI/CD automation pipelines.
- Implement lifecycle and cleanup strategies for AWS S3, SageMaker, and Databricks.
- Support ML model lifecycle using MLflow and SageMaker.
- Help integrate Databricks and AWS workflows, including experiment tracking and model deployment.
- Build tooling to help actuaries and data scientists standardize and optimize their workflows.
- Define usage patterns and best practices (e.g., Python vs PySpark vs Polars).
- Recommend architectural changes to improve performance and scalability.
- Act as a technical bridge between actuarial teams, DevOps, and architecture.
- Contextualize and communicate platform needs to DevOps and cloud architects.
Must-Haves:
- 7+ years of experience as a Data Engineer.
- PySpark and data pipeline experience.
- Cloud-native architecture, particularly in AWS (SageMaker, S3, Lambda, Glue, etc.).
- ML lifecycle tools (e.g., MLflow, SageMaker Experiments).
- Databricks platform experience.
- Understanding of CI/CD and infrastructure-as-code for ML systems.