Video Guides, Tutorials & Hands on labs

  1. "Insert | Update | Delete On Datalake (S3) with Apache Hudi and glue Pyspark" - By Soumil Shah, Nov 17th 2022

  2. "Build a Spark pipeline to analyze streaming data using AWS Glue, Apache Hudi, S3 and Athena" - By Soumil Shah, Nov 19th 2022

  3. "Different table types in Apache Hudi | MOR and COW | Deep Dive | By Sivabalan Narayanan" - By Soumil Shah, Nov 20th 2022

  4. "Simple 5 Steps Guide to get started with Apache Hudi and Glue 4.0 and query the data using Athena" - By Soumil Shah, Dec 8th 2022

  5. "Build Datalakes on S3 with Apache HUDI in a easy way for Beginners with hands on labs | Glue" - By Soumil Shah, Dec 11th 2022

  6. "How to convert Existing data in S3 into Apache Hudi Transaction Datalake with Glue | Hands on Lab" - By Soumil Shah, Dec 14th 2022

  7. "Build Slowly Changing Dimensions Type 2 (SCD2) with Apache Spark and Apache Hudi | Hands on Labs" - By Soumil Shah, Dec 14th 2022

  8. "Hands on Lab with using DynamoDB as lock table for Apache Hudi Data Lakes" - By Soumil Shah, Dec 14th 2022

  9. "Build production Ready Real Time Transaction Hudi Datalake from DynamoDB Streams using Glue &kinesis" - By Soumil Shah, Dec 15th 2022

  10. "Step by Step Guide on Migrate Certain Tables from DB using DMS into Apache Hudi Transaction Datalake" - By Soumil Shah, Dec 17th 2022

  11. "Migrate Certain Tables from ONPREM DB using DMS into Apache Hudi Transaction Datalake with Glue|Demo" - By Soumil Shah, Dec 17th 2022

  12. "Insert|Update|Read|Write|SnapShot| Time Travel |incremental Query on Apache Hudi datalake (S3)" - By Soumil Shah, Dec 18th 2022

  13. "Build Production Ready Alternative Data Pipeline from DynamoDB to Apache Hudi | PROJECT DEMO" - By Soumil Shah, Dec 19th 2022

  14. "Build Production Ready Alternative Data Pipeline from DynamoDB to Apache Hudi | Step by Step Guide" - By Soumil Shah, Dec 19th 2022

  15. "Getting started with Kafka and Glue to Build Real Time Apache Hudi Transaction Datalake" - By Soumil Shah, Dec 20th 2022

  16. "Learn Schema Evolution in Apache Hudi Transaction Datalake with hands on labs" - By Soumil Shah, Dec 21st 2022

  17. "Apache Hudi with DBT Hands on Lab.Transform Raw Hudi tables with DBT and Glue Interactive Session" - By Soumil Shah, Dec 23rd 2022

  18. Apache Hudi on Windows Machine Spark 3.3 and hadoop2.7 Step by Step guide and Installation Process - By Soumil Shah, Dec 24th 2022

  19. Lets Build Streaming Solution using Kafka + PySpark and Apache HUDI Hands on Lab with code - By Soumil Shah, Dec 24th 2022

  20. Bring Data from Source using Debezium with CDC into Kafka&S3Sink &Build Hudi Datalake | Hands on lab - By Soumil Shah, Dec 27th 2022

  21. Comparing Apache Hudi's MOR and COW Tables: Use Cases from Uber - By Soumil Shah, Dec 28th 2022

  22. Step by Step guide how to setup VPC & Subnet & Get Started with HUDI on EMR | Installation Guide | - By Soumil Shah, Dec 30th 2022

  23. Streaming ETL using Apache Flink joining multiple Kinesis streams | Demo - By Soumil Shah, Jan 1st 2023

  24. Transaction Hudi Data Lake with Streaming ETL from Multiple Kinesis Streams & Joining using Flink - By Soumil Shah, Jan 1st 2023

  25. Great Article|Apache Hudi vs Delta Lake vs Apache Iceberg - Lakehouse Feature Comparison by OneHouse - By Soumil Shah, Jan 11th 2023

  26. Build Real Time Streaming Pipeline with Apache Hudi Kinesis and Flink | Hands on Lab - By Soumil Shah, Jan 12th 2023

  27. Build Real Time Low Latency Streaming pipeline from DynamoDB to Apache Hudi using Kinesis,Flink|Lab - By Soumil Shah, Jan 13th 2023

  28. Real Time Streaming Data Pipeline From Aurora Postgres to Hudi with DMS , Kinesis and Flink |DEMO - By Soumil Shah, Jan 15th 2023

  29. Real Time Streaming Pipeline From Aurora Postgres to Hudi with DMS , Kinesis and Flink |Hands on Lab - By Soumil Shah, Jan 16th 2023

  30. Leverage Apache Hudi upsert to remove duplicates on a data lake | Hudi Labs - By Soumil Shah, Jan 17th 2023

  31. Use Apache Hudi for hard deletes on your data lake for data governance | Hudi Labs - By Soumil Shah, Jan 17th 2023

  32. How businesses use Hudi Soft delete features to do soft delete instead of hard delete on Datalake - By Soumil Shah, Jan 17th 2023

  33. Leverage Apache Hudi incremental query to process new & updated data | Hudi Labs - By Soumil Shah, Jan 17th 2023

  34. Global Bloom Index: Remove duplicates & guarantee uniquness | Hudi Labs - By Soumil Shah, Jan 17th 2023

  35. Cleaner Service: Save up to 40% on data lake storage costs | Hudi Labs - By Soumil Shah, Jan 17th 2023

  36. Precomb Key Overview: Avoid dedupes | Hudi Labs - By Soumil Shah, Jan 17th 2023

  37. How do I identify Schema Changes in Hudi Tables and Send Email Alert when New Column added/removed - By Soumil Shah, Jan 20th 2023

  38. How to detect and Mask PII data in Apache Hudi Data Lake | Hands on Lab - By Soumil Shah, Jan 21st 2023

  39. Learn How to restrict Intern from accessing Certain Column in Hudi Datalake with lake Formation - By Soumil Shah, Jan 28th 2023

  40. How do I Ingest Extremely Small Files into Hudi Data lake with Glue Incremental data processing - By Soumil Shah, Feb 7th 2023