Databricks Lead / Data Architect at VDart Inc — NeverHard
Company: VDart Inc
Location: Mississauga, Peel region
Type: Contract
Required skills:
ADLS Gen2
Azure
Azure Data Factory
Data Architecture
Data Engineering
DataStage
Databricks
Delta Lake
Delta Live Tables
ETL
Role: Databricks Lead / Data Architect
Location: Mississauga, ON
Contract
About the Role
We are hiring a senior, deeply hands-on Databricks Lead / Data Architect to drive the Databricks workstream of a large-scale data and AI modernization program for a major Canadian enterprise retail client. This is a build-and-lead role: you will own the technical direction of Databricks-based solutions end to end (architecture, Lakehouse design, data engineering, migration of legacy ETL workloads, and production operations) while remaining personally hands-on in code and design.
You will work side by side with the client's VP of Data & AI, the AVP of Data Platforms & Integration, and their teams, acting as the senior technical authority who turns strategy into delivered, production-grade outcomes. The immediate focus is modernizing a large on-premises ETL estate (IBM DataStage) to an Azure-native Lakehouse on Azure Data Factory and Databricks, and then scaling the platform to power enterprise analytics and AI use cases.
Key Responsibilities
Architect the Lakehouse: Design and own scalable, secure Databricks Lakehouse architecture on Azure (Delta Lake, Unity Catalog, medallion bronze/silver/gold, ADLS Gen2) aligned to enterprise standards.
Stay hands-on: Personally build and review PySpark / Spark SQL pipelines, Delta Live Tables, notebooks, and orchestration, setting the engineering bar rather than just directing it.
Lead legacy migration: Drive conversion of complex legacy ETL (DataStage) workloads to Databricks/PySpark and ADF, including patterns, accelerators, and reusable frameworks for code conversion and validation.
Own performance & cost: Optimize cluster configuration, job performance, partitioning, and cost; establish FinOps and right-sizing practices on Databricks.
Embed governance: Implement data governance, lineage, quality, and access control through Unity Catalog and Purview; ensure security, privacy, and compliance by design.
Enable analytics & AI: Design Gold-layer semantic models and feature pipelines that serve BI (Power BI), advanced analytics, and ML/GenAI use cases (MLflow, Azure ML).
Lead the squad: Provide technical leadership and mentoring to data engineers; define best practices, coding standards, CI/CD (Azure DevOps), and review processes.
Partner with the client: Work closely with the client's VP (Data & AI), AVP (Data Platforms & Integration), platform architects, and business stakeholders to translate requirements into delivery roadmaps and measurable outcomes.
Required Qualifications (Must-Have)
12+ years in data engineering / data platform architecture, with 4+ years of deep, hands-on Databricks delivery.
Expert-level Databricks: Spark (PySpark & Spark SQL), Delta Lake, Delta Live Tables, Unity Catalog, Workflows, performance tuning, and cluster/cost optimization.
Strong Azure data stack: Azure Data Factory, ADLS Gen2, Azure Key Vault, Azure DevOps (CI/CD), and Azure networking/security fundamentals.
Proven migration track record: Led at least one large-scale migration from legacy ETL (e.g., DataStage, Informatica, Teradata) to a cloud lakehouse, including complex transformation logic.
Lakehouse design depth: Medallion architecture, dimensional & semantic modelling, SCD handling, surrogate keys, and data quality / reconciliation frameworks.
Engineering rigor: CI/CD, version control (Git), automated testing/validation, observability, and production support of mission-critical pipelines.
Leadership with hands-on credibility: Demonstrated ability to lead engineers and engage senior client stakeholders while still contributing code and designs directly.
Location: Based in the Greater Toronto Area / Ontario, eligible to work in Canada, and able to work on a hybrid basis with client-site presence as required.
Preferred / Nice-to-Have
Databricks certifications (e.g., Databricks Certified Data Engineer Professional / Solutions Architect) and relevant Azure certifications (DP-203, AZ-305).
Experience in retail, supply chain, merchandising, or financial-services data domains.
Familiarity with IBM DataStage, DB2, Oracle, and legacy on-prem ETL estates.
Exposure to agentic AI / GenAI patterns, MLOps/LLMOps, and AI-assisted code migration tooling.
Experience operating a high-availability data platform with warm-standby disaster recovery (DR).