Data Engineer with 5+ years designing and operating cloud-native batch and streaming data platforms across AWS, Snowflake,
and Databricks ecosystems. Specialized in scalable ETL/ELT, CDC pipelines, lakehouse architectures, and governed analytics
platforms enabling reliable KPIs, real-time analytics, cost optimization, and enterprise data quality for cross-functional business
stakeholders.
Technologies:
Certification: Databricks Certified Data Engineer Associate
Programming: Python, SQL, Scala, Java, R, Bash, Linux Shell
Data Engineering: Apache Spark, Databricks, Delta Lake, dbt (Models, Tests, Incremental, Snapshots), Batch & Streaming Pipelines, CDC Pipelines, Medallion Architecture, Dimensional Modeling (Star, SCD Type 1/2), Event-Driven Ingestion
Cloud & Warehousing: AWS (S3, Glue, Lambda, EMR, Kinesis, IAM, CloudWatch), Snowflake (Snowpipe, Streams, Tasks, Time Travel), Azure Data Factory, GCP (Working Knowledge)
Streaming & Messaging: Kafka, Kafka Connect, Snowpipe Streaming
Governance & Data Quality: Data Contracts, SLAs, Schema Evolution, Data Lineage, PII Tokenization, Data Validation
DevOps & Orchestration: Airflow, Jenkins, GitHub, Terraform, Docker, CI/CD Pipelines
Analytics Enablement: Tableau, Power BI, KPI Development, Self-Service Data Models
Resume/CV: https://drive.google.com/file/d/1niUOkJZCV2_OalDdBr3HhmGxBZv...
Email: shwetsaoji@gmail.com