Job Description: Principal Databricks Data Engineer
Toronto, ON - Hybrid (4 Days WFO)
Contract: 6-12 months
Role Description: Principal Data Engineer, Databricks (Finance/Risk Cloudera Modernization)
Experience
- 12-18 years overall data engineering experience
- 8 years across enterprise Data Warehouse / Data Lake platforms
- 5 years with Databricks/Spark at scale
Role Summary
The Principal Databricks Data Engineer is a senior technical leader responsible for modernizing large-scale finance and risk data platforms from legacy Cloudera ecosystems to cloud-native Databricks lakehouse architectures.
This role demands deep hands-on expertise in data warehousing, data lakes, finance/risk data models, and semantic/consumption layers, with strong experience supporting regulatory, management reporting, and analytics use cases.
The individual will serve as a hands-on architect and technical authority, leading platform modernization while partnering closely with Finance, Risk, Analytics, and Governance stakeholders.
Key Responsibilities
Cloudera to Databricks Modernization
- Lead modernization of legacy Cloudera platforms (CDH/CDP, Hive, HBase, Impala, Spark) into Databricks Lakehouse.
- Redesign ingestion, transformation, and consumption patterns from HDFS-centric architectures to cloud object storage and Delta Lake.
- Refactor legacy Hive/Impala logic into PySpark / Spark SQL-based ELT pipelines.
- Ensure data parity, reconciliation, and audit integrity during platform migration.
Enterprise Data Warehouse / Data Lake Architecture
- Design and govern enterprise Data Warehouse and Data Lake / Lakehouse architectures.
- Implement layered architectures spanning:
  - Raw/Landing zones
  - Curated/conformed layers
  - Semantic/consumption layers
- Modernize traditional EDW patterns into domain-aligned, scalable lakehouse designs.
Finance/Risk Data Modeling
- Support implementation of finance and risk data models, including:
  - General Ledger / Sub-ledger data
  - Accounting events and financial hierarchies
  - Risk exposure, liquidity, credit, and market risk models
- Enable aggregation, drill-down, and drill-back from reports to transaction-level data.
- Support regulatory reporting, management reporting, and analytics use cases.
Semantic/Consumption Layers
- Build and manage semantic/consumption layers to ensure consistent business logic across:
  - BI and reporting tools
  - Finance/Risk analytics
  - Self-service analytics platforms
- Define metrics, dimensions, hierarchies, and KPIs aligned to finance and risk definitions.
- Implement semantic models using:
  - Databricks SQL
  - Delta tables
  - dbt or equivalent transformation frameworks
Databricks Engineering and Optimization
- Engineer large-scale pipelines using PySpark, Spark SQL, and Delta Lake.
- Implement medallion architecture (Bronze/Silver/Gold) aligned to business domains.
- Optimize Databricks workloads for cost, performance, and reliability (Z-ORDER, OPTIMIZE, caching, cluster policies).
Data Governance, Quality, and Lineage
- Implement data quality frameworks, reconciliation co
Pay: $60.00-$253,082.90 per hour
Work Location: Hybrid remote in Toronto, ON (Toronto District)