Building Data Lakes on AWS with Kafka Connect, Debezium, Apicurio Registry, and Apache Hudi
February 27, 2024 by Gary A. Stafford
Tags: blog, apache hudi, itnext, beginner, apache kafka, kafka connect, debezium, apicurio registry, aws, apache spark, deltastreamer, hudi streamer, amazon rds, amazon mks, amazon eks, aws glue, amazon emr

Building an Open Source Data Lake House with Hudi, Postgres Hive Metastore, Minio, and StarRocks
February 6, 2024 by Soumil Shah
Tags: blog, apache hudi, linkedin, beginner, apache spark, apache hive, hive metastore, minio, starrocks, docker, python, postgres, postgresql

Apache Hudi: Managing Partition on a petabyte-scale table
February 4, 2024 by Krishna Prasad
Tags: blog, apache hudi, medium, intermediate, partition, aws glue, apache spark, aws s3

Leverage Partition Paths of your data lake tables to Optimize Data Retrieval Costs on the cloud
January 30, 2024 by Krishna Prasad
Tags: blog, apache hudi, medium, intermediate, aws glue, cost, apache spark, partition

Data Engineering: Bootstrapping Data lake with Apache Hudi
January 20, 2024 by Krishna Prasad
Tags: blog, apache hudi, medium, beginner, ETL, aws glue, apache spark, aws s3

Learn How to Move Data From MongoDB to Apache Hudi Using PySpark
January 20, 2024 by Soumil Shah
Tags: blog, apache hudi, linkedin, beginner, mongodb, apache spark, pyspark

In-House Data Lake with CDC Processing, Hudi, Docker
January 11, 2024 by Rahul
Tags: blog, apache hudi, medium, intermediate, docker, cdc, apache kafka, debezium, apache spark, aws s3

Introduction to Apache Hudi
January 9, 2024 by Andrew Savchyns
Tags: blog, apache hudi, medium, beginner, apache spark

From Data lake to Microservices: Unleashing the Power of Apache Hudi's Record Level Index with FastAPI and Spark Connect
January 1, 2024 by Soumil Shah
Tags: blog, apache hudi, linkedin, beginner, apache spark, record level index, pyspark, upserts, FastAPI

Getting started with Apache Hudi
December 1, 2023 by DataCouch
Tags: apache hudi, apache spark, how-to, getting started, medium

Apache Hudi: From Zero To One (2/10)
September 6, 2023 by Shiyan Xu
Tags: blog, apache hudi, queries, reads, datumagic, apache spark, time travel query, incremental query, snapshot query, read optimized query