Roadmap

The Hudi community strives to deliver a major release every 2-3 months and a minor release every month. This page captures the forward-looking roadmap of ongoing and upcoming projects and when they are expected to land, broken down by area of the stack.

H1 2022 Releases

Next major release: 0.11.0 (Apr 2022)

| Release | Timeline |
| --- | --- |
| 0.10.1 | Jan 2022 |
| 0.11.0 | Apr 2022 |
| 0.12.0 | Jun 2022 |
| 1.0.0 | Summer 2022 |

Transactions/Database Layer

| Feature | Target Release | Tracking |
| --- | --- | --- |
| Space-filling curves hardening & perf improvements | 0.11 | HUDI-2100 |
| Metadata table update via multi-table transactions, turned on by default | 0.11 | HUDI-1292 |
| Metadata Index, as a bloom index alternative, fetching col_stats and bloom_filters from metadata table, improving upsert performance | 0.11 | HUDI-1822, RFC-37 |
| Support for Encryption | 0.11 | HUDI-2370 |
| Schema-on-read for non-backwards compatible schema evolution | 0.11 | HUDI-2429 |
| Improvements to merge-on-read log merging/reading with streaming semantics | 0.11 | HUDI-3081 |
| Indexed columns support & elimination of partitioning | 0.11 | HUDI-512 |
| Record-level index to speed up uuid based upserts/deletes | 0.12 | HUDI-53 |
| Eager conflict detection for Optimistic Concurrency Control | 0.12 | HUDI-1575 |
| Indexed timeline and infinite retention of versions | 0.12 | RFC coming soon |
| Improvements to streaming read and full CDC data model support | 0.12 | HUDI-2749, RFC coming soon |
| Consistent hashing based file distribution over storage to overcome throttling issues for very large tables | 0.12 | RFC published soon |
| Lock free concurrency control | 0.12 -> 1.0.0 | HUDI-3187 |
| Non-blocking/Lock-free updates during clustering | 0.12 -> 1.0.0 | HUDI-1042 |
| Time Travel updates, deletes | 0.12 -> 1.0.0 | |
| General purpose support for multi-table transactions | 0.12 -> 1.0.0 | |

Execution Engine Integration

| Feature | Target Release | Tracking |
| --- | --- | --- |
| Spark SQL DML fixes & enhancements | 0.11 | HUDI-1658 |
| Data-skipping for Hive and Spark based on col_stats from metadata table | 0.11 | HUDI-1296, RFC-27 |
| Non-keyed tables with updates and deletes | 0.11 | HUDI-2968 |
| Trino Connector for Hudi, with read/query support | 0.12 | HUDI-2687, RFC-38 |
| Spark Datasource V2 | 0.12 | HUDI-1297, HUDI-2531 |
| Complete ORC Support across query engines | 0.12 | HUDI-57 |
| Presto Connector for Hudi, with read/query support | 0.12 | PRESTO-17006 |
| Multi-Modal indexing full integration across Presto/Trino/Spark queries | 0.12 -> 1.0.0 | HUDI-1822 |
| Materialized Views with incremental updates using Flink | 1.0.0 | |
| SQL DML support for Presto/Trino connectors (could be accelerated based on community feedback) | 1.0.0 | |
| Explore other execution engines/runtimes (Ray, native Rust, Python) | 1.0.0 | |

Platform Services

| Feature | Target Release | Tracking |
| --- | --- | --- |
| Native support for AWS Glue Metastore | 0.11 | HUDI-2757 |
| BigQuery and Snowflake external table integration | 0.12 | RFC-34 |
| JDBC Incremental Source GA | 0.12 | HUDI-1859 |
| Mutable, CDC Stream support for Kafka Connect Sink | 0.12 | HUDI-2324 |
| Airbyte integration | 0.12 | RFC coming soon |
| Apache Pulsar integration for Delta Streamer (blocked on upstream) | 0.12 | HUDI-246 |
| Kinesis deltastreamer source, with DynamoDB CDC | 0.12 | HUDI-1386, HUDI-310 |
| Support for reliable, event based ingestion from cloud stores - GCS, Azure and the others | 1.0.0 | HUDI-1896 |
| Hudi Timeline Metaserver for locks, column status and table listings (could be accelerated based on community feedback) | 1.0.0 | Strawman design |
| Mutable, Transactional caching for Hudi Tables (could be accelerated based on community feedback) | 1.0.0 | Strawman design |