Apache Hudi does XYZ (1/10): File pruning with multi-modal index. June 16, 2025, by Shiyan Xu. Tags: hudi, spark, blog, course, tutorial, datumagic, data lake, lakehouse, apache hudi, apache spark
Use open table format libraries on AWS Glue 5.0 for Apache Spark. December 4, 2024, by Sotaro Hikita and Noritaka Sekiyama. Tags: announcement, blog, apache hudi, aws glue, apache spark, table format, amazon
Apache Hudi, Spark and Minio: Hands-on Lab in Docker. October 2, 2024, by Sanjeet Shukla. Tags: how-to, Apache Hudi, Apache Spark, Minio, docker, devgenius
Hands-on with Apache Hudi and Spark. September 22, 2024, by Sanjeet Shukla. Tags: blog, Apache Hudi, Apache Spark, devgenius
Apache Hudi: From Zero To One (10/10). April 13, 2024, by Shiyan Xu. Tags: hudi, spark, blog, course, tutorial, datumagic, data lake, lakehouse, apache hudi, apache spark
Apache Hudi: From Zero To One (9/10). March 5, 2024, by Shiyan Xu. Tags: hudi, spark, blog, course, tutorial, datumagic, data lake, lakehouse, apache hudi, apache spark
Building Data Lakes on AWS with Kafka Connect, Debezium, Apicurio Registry, and Apache Hudi. February 27, 2024, by Gary A. Stafford. Tags: blog, apache hudi, itnext, beginner, apache kafka, kafka connect, debezium, apicurio registry, aws, apache spark, deltastreamer, hudi streamer, amazon rds, amazon mks, amazon eks, aws glue, amazon emr
Building an Open Source Data Lake House with Hudi, Postgres Hive Metastore, Minio, and StarRocks. February 6, 2024, by Soumil Shah. Tags: blog, apache hudi, linkedin, beginner, apache spark, apache hive, hive metastore, minio, starrocks, docker, python, postgres, postgresql
Apache Hudi: Managing Partition on a petabyte-scale table. February 4, 2024, by Krishna Prasad. Tags: blog, apache hudi, medium, intermediate, partition, aws glue, apache spark, aws s3
Leverage Partition Paths of your data lake tables to Optimize Data Retrieval Costs on the cloud. January 30, 2024, by Krishna Prasad. Tags: blog, apache hudi, medium, intermediate, aws glue, cost, apache spark, partition
Data Engineering: Bootstrapping Data lake with Apache Hudi. January 20, 2024, by Krishna Prasad. Tags: blog, apache hudi, medium, beginner, ETL, aws glue, apache spark, aws s3
Learn How to Move Data From MongoDB to Apache Hudi Using PySpark. January 20, 2024, by Soumil Shah. Tags: blog, apache hudi, linkedin, beginner, mongodb, apache spark, pyspark