Long Term Support
We aim to maintain 0.12 for a longer period and provide stable releases for users to migrate to through the latest 0.12.x releases. This release (0.12.2) is the latest 0.12 release.
Migration Guide
This release (0.12.2) does not introduce any new table version, so if you are on 0.12.0, no migration is needed.
If migrating from an older release, please check the migration guides in the previous release notes, specifically the upgrade instructions in 0.6.0, 0.9.0, 0.10.0, 0.11.0, and 0.12.0.
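Since 0.12.2 introduces no table-version change, moving from 0.12.0 or 0.12.1 typically amounts to swapping in the new bundle on the classpath. Below is a minimal sketch, assuming a Spark 3.2 / Scala 2.12 deployment (the bundle coordinates vary with your Spark and Scala versions) and a hypothetical table path:

```scala
// Minimal smoke test after placing the 0.12.2 bundle on the classpath, e.g.
//   spark-shell --packages org.apache.hudi:hudi-spark3.2-bundle_2.12:0.12.2
// (coordinates assume Spark 3.2 / Scala 2.12 -- adjust to your deployment).
import org.apache.spark.sql.SparkSession

object Hudi0122SmokeTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hudi-0.12.2-smoke-test")
      // Kryo is the serializer Hudi recommends for Spark jobs.
      .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .getOrCreate()

    // "/tmp/hudi_trips" is a hypothetical path; tables written by 0.12.0 or
    // 0.12.1 are readable as-is, with no upgrade step triggered.
    val df = spark.read.format("hudi").load("/tmp/hudi_trips")
    df.show(10)
    spark.stop()
  }
}
```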
Bug Fixes
The 0.12.2 release is mainly intended for bug fixes and stability. The fixes span many components, including the following (a short Spark SQL sketch follows this list):
- DeltaStreamer
- Data type/schema-related bug fixes
- Table services
- Metadata table
- Spark SQL
- Presto 稳定性/性能修复
- Trino 稳定性/性能修复
- Meta sync
- Flink engine
- Unit, functional, and integration tests and CI
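Several of the Spark SQL fixes listed in the release notes below touch MERGE INTO and INSERT INTO behavior (e.g. HUDI-4946, HUDI-5260, HUDI-5347). As a hedged illustration of the affected statement shape only, with hypothetical table names, not code taken from the fixes themselves:

```scala
// Sketch of the Spark SQL MERGE INTO surface covered by several 0.12.2 fixes.
// "hudi_target" and "updates_source" are hypothetical tables.
import org.apache.spark.sql.SparkSession

object HudiMergeIntoExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hudi-merge-into-example")
      .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Hudi's session extension enables its SQL commands (MERGE INTO, etc.).
      .config("spark.sql.extensions",
        "org.apache.spark.sql.hudi.HoodieSparkSessionExtension")
      // On Spark 3.2+, Hudi also requires its catalog implementation.
      .config("spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.hudi.catalog.HoodieCatalog")
      .getOrCreate()

    spark.sql(
      """MERGE INTO hudi_target t
        |USING updates_source s
        |ON t.id = s.id
        |WHEN MATCHED THEN UPDATE SET *
        |WHEN NOT MATCHED THEN INSERT *
        |""".stripMargin)
  }
}
```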
Release Notes
Sub-task
- [HUDI-5244] – Fix bugs in schema evolution client: lost operation field and schema not found
Bug
- [HUDI-3453] – Metadata table throws NPE when scheduling compaction plan
- [HUDI-3661] – Flink async compaction is not thread safe when use watermark
- [HUDI-4281] – Using hudi to build a large number of tables in spark on hive causes OOM
- [HUDI-4588] – Ingestion failing if source column is dropped
- [HUDI-4855] – Bootstrap table from Deltastreamer cannot be read in Spark
- [HUDI-4893] – More than 1 splits are created for a single log file for MOR table
- [HUDI-4898] – For MOR table, presto/hive should respect payload class when merging parquet file and log file
- [HUDI-4901] – Add avro version to Flink profiles
- [HUDI-4946] – MERGE INTO with no preCombineField produces duplicate rows in insert-only mode
- [HUDI-4952] – Reading from metadata table could fail when there are no completed commits
- [HUDI-4966] – Meta sync throws exception if TimestampBasedKeyGenerator is used to generate partition path containing slashes
- [HUDI-4971] – aws bundle causes class loading issue
- [HUDI-4975] – datahub sync bundle causes class loading issue
- [HUDI-4998] – Inference of META_SYNC_PARTITION_EXTRACTOR_CLASS does not work
- [HUDI-5003] – InLineFileSystem will throw NumberFormatException, cause the type of startOffset is int and out of bounds
- [HUDI-5007] – Prevent Hudi from reading the entire timeline when performing a LATEST streaming read
- [HUDI-5008] – Avoid unset HoodieROTablePathFilter in IncrementalRelation
- [HUDI-5025] – Rollback failed with log file not found when rollOver in rollback process
- [HUDI-5041] – Lock metric register conflict error
- [HUDI-5057] – Fix msck repair hudi table
- [HUDI-5058] – "The primary key cannot be empty" error when Flink reads the hudi table
- [HUDI-5061] – Bulk insert operation doesn't throw exceptions other than IOException
- [HUDI-5063] – totalScantime and other run time stats missing from commit metadata
- [HUDI-5070] – Fix Flaky TestCleaner test : testInsertAndCleanByCommits
- [HUDI-5076] – Non-serializable path used with engineContext during metadata table initialization
- [HUDI-5087] – Max value read from metatable incorrect
- [HUDI-5088] – Failed to synchronize the hive metadata of the Flink table
- [HUDI-5092] – Querying Hudi table throws NoSuchMethodError in Databricks runtime
- [HUDI-5096] – boolean param is broken in HiveSyncTool
- [HUDI-5097] – Read 0 records from partitioned table without partition fields in table configs
- [HUDI-5151] – Flink data skipping doesn't work with ClassNotFoundException of InLineFileSystem
- [HUDI-5157] – Duplicate partition path for chained hudi tables.
- [HUDI-5163] – Failure handling w/ spark ds write failures
- [HUDI-5176] – Incremental source may miss commits if there are inflight commits before completed commits
- [HUDI-5185] – Compaction run fails with --hoodieConfigs
- [HUDI-5203] – Debezium payload does not handle null-field cases
- [HUDI-5228] – Flink table service job's fs view conf overwrites that of the writing job
- [HUDI-5242] – Do not fail Meta sync in Deltastreamer when inline table service fails
- [HUDI-5251] – Unexpected avro dependency in flink 1.15 bundle
- [HUDI-5253] – HoodieMergeOnReadTableInputFormat could have duplicate records issue if it contains delta files while still splittable
- [HUDI-5260] – Insert into sql with strict insert mode and no preCombineField should not overwrite existing records
- [HUDI-5277] – RunClusteringProcedure can't exit correctly
- [HUDI-5286] – UnsupportedOperationException throws when enabling filesystem retry
- [HUDI-5291] – NPE in column stats for null values
- [HUDI-5320] – Spark SQL CTAS does not propagate Table properties to actual SparkSqlWriter
- [HUDI-5325] – Fix Create Table to propagate properly Metadata Table enabling config
- [HUDI-5336] – Fix log file parsing to consider "." at the beginning
- [HUDI-5346] – Fixing performance traps in CTAS
- [HUDI-5347] – Fix Merge Into performance traps
- [HUDI-5350] – OOM causes compaction events to be lost
- [HUDI-5351] – Handle meta fields being disabled in Bulk Insert Partitioners
- [HUDI-5373] – Different fileids are assigned to the same bucket
- [HUDI-5375] – Fix re-using of file readers w/ metadata table in FileIndex
- [HUDI-5393] – Remove the reuse of metadata table writer for flink write client
- [HUDI-5403] – Input Format class has metadata table enabled for file listing unexpectedly by default
- [HUDI-5409] – Avoid file index and use fs view cache in COW input format
- [HUDI-5412] – Send the bootstrap event if the JM also rebooted
Improvement
- [HUDI-4526] – Improve handling when the spillableMapBasePath disk directory is full
- [HUDI-4799] – Improve analyzer exception message when an expression cannot be resolved
- [HUDI-4960] – Upgrade Jetty version for Timeline server
- [HUDI-4980] – Make avg record size calculated based on commit instant only
- [HUDI-4995] – Dependency conflicts on apache http with other projects
- [HUDI-4997] – Use jackson-v2 imports in place of jackson-v1
- [HUDI-5002] – Remove deprecated API usage in SparkHoodieHBaseIndex#generateStatement
- [HUDI-5027] – Replace hardcoded hbase config keys with HbaseConstants
- [HUDI-5045] – Add tests to integ test to test bulk_insert followed by upsert
- [HUDI-5066] – Support hoodie source metaclient cache for flink planner
- [HUDI-5102] – Source operators (monitor and reader) support user-specified uid
- [HUDI-5104] – Add feature flag to disable HoodieFileIndex and fall back to HoodieROTablePathFilter
- [HUDI-5111] – Add metadata on read support to integ tests
- [HUDI-5184] – Remove export PYSPARK_SUBMIT_ARGS="--master local*" from HoodiePySparkQuickstart.py
- [HUDI-5247] – Clean up java client tests
- [HUDI-5296] – Support disabling schema on read if not required
- [HUDI-5338] – Adjust coalesce behavior within "NONE" sort mode for bulk insert
- [HUDI-5344] – Upgrade com.google.protobuf:protobuf-java
- [HUDI-5345] – Avoid fs.exists calls for metadata table in HFileBootstrapIndex
- [HUDI-5348] – Cache file slices within MDT reader
- [HUDI-5357] – Optimize release artifacts' deployment
- [HUDI-5370] – Properly close file handles for Metadata writer
Test
- [HUDI-5383] – Test 0.12.2 release branch
Task
- [HUDI-3287] – Remove unnecessary deps in hudi-kafka-connect
- [HUDI-5081] – Resources clean-up in hudi-utilities tests
- [HUDI-5221] – Make the decision for flink sql bucket index case-insensitive
- [HUDI-5223] – Partial failover for flink
- [HUDI-5227] – Upgrade Jetty to 9.4.48
This is an original article by xiaozhch5, author of the blog "从大数据到人工智能" (From Big Data to AI), licensed under CC 4.0 BY-SA. Please include the original source link and this notice when republishing.
Original link: https://cloud.tencent.com/developer/article/2208628