Executive Summary 💡
Senior Data Platform Architect with 13+ years of experience specializing in large-scale distributed data systems, data governance (lineage, quality, compliance), and cloud-native platforms. Delivered enterprise data platforms processing 2TB+ daily with 70% latency reduction and 99.9% SLA. Expert in orchestration (Airflow/Kubernetes), observability (Datadog/Prometheus), IaC (Terraform), multi-cloud (AWS/Azure), Data Mesh principles, and AI/ML integration, with experience across Salesforce, Apple, Qualcomm, IBM, and PwC.
Professional Experience 💼
Databricks • Spark • Python • Data Platform Engineering • Data Governance • AI/ML Pipelines • Orchestration
- Leading data platform initiatives with Databricks/Spark for enterprise analytics and AI/ML workloads for the Go-to-Market product
- Establishing data governance standards including quality frameworks and pipeline observability
- Building AI-ready data infrastructure on Databricks enabling GenAI model deployment and ML feature engineering
AWS • Spark • Kafka • Python • Airflow • Terraform • Kubernetes • Datadog • Prometheus • GenAI • GraphDB
- Architected cloud-native data platform with event-driven pipelines, achieving 70% latency reduction using Spark/EKS/Kafka orchestration
- Implemented platform observability with Datadog/Prometheus; established SLAs and data quality monitoring for CI/CD systems
- Built IaC automation (Terraform/EKS/Jenkins) reducing deployment time by 70%; enabled developer self-service patterns
- Led GenAI/AI integration into data infrastructure achieving 25-40% performance improvement
Apple Cloud • AWS • Python • Kafka • PySpark • Snowflake • Docker • Kubernetes • EMR • Multi-Cloud Migration
- Led 12-member engineering team architecting enterprise data platform processing 100GB+ daily with 99.9% SLA
- Built real-time data pipelines (Kafka) handling 1M+ msgs/hour with comprehensive data quality and lineage tracking
- Led cloud migration (on-prem to AWS) with Data Mesh principles, reducing costs by 50% and improving scalability by 300%
- Established data governance standards for data discovery, quality controls, and compliance across supply-chain domain
- Built data APIs with semantic models and metadata management serving 10+ internal analytics teams
Big-Data Platforms • Spark • Python • HIVE • MapR • Kafka • Airflow • TensorFlow • Hadoop • Data Lakehouse
- Architected large-scale distributed data platform processing 2TB+ daily with Spark/HIVE/MapR DB in Hadoop ecosystem
- Built AI/ML data pipelines with 92% model accuracy; optimized data infrastructure, reducing resource usage by 25%
- Implemented data quality and monitoring frameworks ensuring data reliability across enterprise analytics workloads
- Won Innovation Maestro, HaQkathon; presented distributed platform architecture to CxO leadership
- Designed orchestration workflows (Airflow) and data transformation frameworks improving pipeline efficiency by 35%
- Partnered with engineering and analytics teams enabling developer self-service access to 15+ data products
Azure • Multi-Cloud • Data Factory • MongoDB • Python • Cloud Migration • Data Integration
- Led 6-member team architecting multi-cloud data platform on Azure, delivered ahead of schedule
- Migrated 500GB+ data with 99.9% integrity using Azure Data Factory; established data quality SLAs
- Implemented IaC automation and compliance frameworks meeting regulatory requirements across 3 geographic regions
BI • Data Warehouse • SQL Server • Oracle • SSIS • ETL • Python • Data Visualization • Dashboards
- Architected BI platform with OLTP/OLAP optimization, querying 1B+ rows in 2-5 seconds; delivered dashboards for business analytics
- Built scalable ETL pipelines processing 50M+ rows daily; won STAR Performer for data platform excellence
- Implemented logical data modeling and schema versioning improving data consistency by 40% across analytics workloads
- Collaborated with business stakeholders across Sales and Finance, enabling data-driven decision-making for 20+ teams