Building production-grade data pipelines at enterprise scale — Azure Databricks, Delta Lake, and PySpark across lakehouse architectures that process hundreds of millions of records.
4 years. 3 enterprises. One data stack.
Senior Data Engineer with 4 years at Celebal Technologies building production-grade pipelines on Azure Databricks. Delivered end-to-end medallion-architecture lakehouse solutions for a large-scale conglomerate, migrated 600+ legacy ETL jobs for a leading private-sector bank's credit risk platform, and re-implemented Oracle PL/SQL sales datamarts at 500M+ record scale for a global industrial technology manufacturer.
Comfortable owning the full pipeline lifecycle — from Auto Loader ingestion and PySpark transformation to Delta Lake optimisation, ADF orchestration, and CI/CD deployment via Databricks Asset Bundles. Targeting a senior Data Engineer role with meaningful scope on cloud-native data platforms.
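The Asset Bundles step of that lifecycle boils down to a `databricks.yml` at the repo root that declares the jobs and target workspaces, so the same pipeline definition deploys to dev and prod from CI. A minimal sketch, with every name, host URL, and path a placeholder rather than anything from the actual projects:

```yaml
# databricks.yml — minimal Asset Bundle sketch (all names/hosts illustrative)
bundle:
  name: edp_pipelines

targets:
  dev:
    mode: development
    workspace:
      host: https://adb-1111111111111111.1.azuredatabricks.net
  prod:
    mode: production
    workspace:
      host: https://adb-2222222222222222.2.azuredatabricks.net

resources:
  jobs:
    bronze_ingestion:
      name: bronze_ingestion
      tasks:
        - task_key: autoloader_bronze
          notebook_task:
            notebook_path: ./notebooks/bronze_ingestion.py
          job_cluster_key: etl_cluster
      job_clusters:
        - job_cluster_key: etl_cluster
          new_cluster:
            spark_version: 14.3.x-scala2.12
            node_type_id: Standard_DS3_v2
            num_workers: 2
```

`databricks bundle deploy -t dev` then pushes the job definition and notebooks to the dev workspace; pointing `-t prod` at the same file is what makes the CI/CD story repeatable.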
Designed and delivered the end-to-end data ingestion layer for a large-scale Enterprise Data Platform, a lakehouse implementation built on Azure Databricks. Owned the full pipeline lifecycle from heterogeneous source systems through the Bronze (Raw) layer to the Silver (Enriched) layer, following a strict medallion architecture pattern.
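The Bronze-layer ingestion pattern described above is Auto Loader's incremental file discovery feeding a Delta table, with audit columns added on the way in. A hedged sketch of that shape — it assumes the Databricks runtime (the `spark` session and the proprietary `cloudFiles` source), and every path and table name is a placeholder, not from the actual platform:

```python
# Bronze-layer Auto Loader stream — illustrative paths/tables, Databricks runtime assumed.
from pyspark.sql import functions as F

bronze_stream = (
    spark.readStream.format("cloudFiles")            # Databricks Auto Loader source
    .option("cloudFiles.format", "json")             # format of the landed source files
    .option("cloudFiles.schemaLocation",             # where Auto Loader tracks schema evolution
            "/mnt/edp/_schemas/orders")
    .load("/mnt/edp/landing/orders/")
    .withColumn("_ingested_at", F.current_timestamp())          # audit: load timestamp
    .withColumn("_source_file", F.col("_metadata.file_path"))   # audit: originating file
)

(
    bronze_stream.writeStream
    .option("checkpointLocation", "/mnt/edp/_checkpoints/orders_bronze")
    .trigger(availableNow=True)                      # process all pending files, then stop
    .toTable("edp.bronze.orders")                    # raw Delta table in the Bronze layer
)
```

The `availableNow` trigger is what lets the same streaming definition run as a scheduled incremental batch; Silver-layer jobs then read `edp.bronze.orders` and apply the enrichment logic.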
Executed the end-to-end data engineering migration of a leading private-sector bank's credit risk analytics department from an on-premises 16-node Cloudera Hadoop cluster to the Azure Databricks cloud platform — migrating 600+ Pentaho ETL jobs that process 25 TB of credit risk and campaign management data per month across 80M unique customers.
Executed the legacy-to-cloud migration of a global manufacturer's enterprise sales reporting platform from Oracle on-premises to the Databricks platform, re-implementing five regional sales datamarts on a modern Delta Lake foundation. Converted a large body of Oracle PL/SQL procedures, packages, and Unix shell-scripted conditional logic into production-grade PySpark.
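A typical conversion in that body of work is replacing a row-by-row PL/SQL cursor loop with a set-based PySpark equivalent — an aggregate plus a window function writing straight to Delta. A sketch of the pattern, assuming the Databricks runtime; the table and column names are hypothetical stand-ins, not the real datamart schema:

```python
# Set-based PySpark replacement for a PL/SQL cursor that ranked products by
# sales within each region — all table/column names illustrative.
from pyspark.sql import functions as F, Window

sales = spark.table("sales.silver.orders")

# Window over the aggregated result: rank products by revenue within a region.
w = Window.partitionBy("region").orderBy(F.desc("net_amount"))

regional_rank = (
    sales.groupBy("region", "product_id")
    .agg(F.sum("net_amount").alias("net_amount"))    # replaces the cursor's running total
    .withColumn("rank_in_region", F.row_number().over(w))
)

# Replaces the procedure's INSERT into the datamart target table.
(
    regional_rank.write.format("delta")
    .mode("overwrite")
    .saveAsTable("sales.gold.regional_product_rank")
)
```

One declarative job like this typically absorbs an entire PL/SQL procedure plus the shell-script conditionals that decided when to run it, which is what makes the migration tractable at 500M+ record scale.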
Open to senior Data Engineer roles with scope on cloud-native data platforms, Azure Databricks, and large-scale pipeline architecture. Feel free to reach out.