Conversation

Contributor

@buvb buvb commented Dec 16, 2025

Purpose

Linked issue: close #2128

This PR enables schema evolution for datalake-enabled tables, specifically supporting ADD COLUMN ... LAST with NULLABLE columns. When a user executes ALTER TABLE ADD COLUMN on a lake-enabled table, the schema change is first applied to Fluss (source of truth), then synchronized to Paimon.

Brief change log

  • CoordinatorService: Pass LakeCatalog and LakeCatalogContext to alterTableSchema() for Paimon synchronization
  • MetadataManager: Add syncSchemaChangesToLake() to sync schema changes to Paimon after the Fluss schema update; skip schema registration if the schema is unchanged (retry idempotency)
  • SchemaUpdate: Support idempotent addColumn() - if the column already exists with the same type and comment, treat it as a no-op
  • PaimonLakeCatalog: Handle ColumnAlreadyExistException as idempotent success for retry scenarios
  • PaimonConversions: Map Fluss AddColumn to Paimon SchemaChange, inserting the new column before the system columns
  • FlussRecordAsPaimonRow: Handle the tiering transition period when a Fluss record is wider than the Paimon schema, using min(internalRow.getFieldCount(), businessFieldCount)
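The field-count handling in the last bullet can be sketched in plain Java (the names and array representation here are hypothetical; the real code operates on the Fluss InternalRow and the Paimon row type):

```java
import java.util.Arrays;

public class FieldCountSketch {
    // Copy only the fields both sides know about: during a tiering
    // transition the Fluss record may be wider than the Paimon schema,
    // so we cap the copy at min(sourceFieldCount, businessFieldCount).
    static Object[] project(Object[] flussFields, int businessFieldCount) {
        int n = Math.min(flussFields.length, businessFieldCount);
        Object[] out = new Object[businessFieldCount];
        for (int i = 0; i < n; i++) {
            out[i] = flussFields[i];
        }
        // Fields beyond the source width stay null (Paimon-wider case).
        return out;
    }

    public static void main(String[] args) {
        // Fluss record wider than Paimon schema: the extra field is ignored.
        System.out.println(Arrays.toString(project(new Object[] {1, "a", true}, 2)));
        // Paimon schema wider than Fluss record: the missing field is padded with null.
        System.out.println(Arrays.toString(project(new Object[] {1}, 2)));
    }
}
```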

Tests

  • LakeEnabledTableCreateITCase#testAlterLakeEnabledTableSchema - Verify ADD COLUMN syncs to Paimon with correct type and comment
  • FlussRecordAsPaimonRowTest#testFlussRecordWiderThanPaimonSchema - Verify tiering doesn't crash when Fluss record has more fields than Paimon schema
  • FlussRecordAsPaimonRowTest#testPaimonSchemaWiderThanFlussRecord - Verify padding NULL for missing fields

API and Format

No API or storage format changes.

Documentation

No documentation changes required for this MVP. Future documentation may be needed when more schema evolution operations are supported.

Contributor

@loserwang1024 loserwang1024 left a comment

@buvb Thanks for your great work. I have left some advice.

Would you like to add a test in PaimonTieringITCase or PaimonTieringTest for an e2e test (just like what PaimonTieringITCase#testTieringForAlterTable does)? Add a column before and between tiering service runs.

private final int bucket;
private LogRecord logRecord;
private int originRowFieldCount;
private final int businessFieldCount;
Contributor

maybe dataFieldCount?

Contributor Author

I prefer businessFieldCount because it clearly distinguishes from system columns (__bucket, __offset, __timestamp). dataFieldCount might be confused with total field count.

// update the schema
zookeeperClient.registerSchema(tablePath, newSchema, table.getSchemaId() + 1);
// update the schema in Fluss (ZK) first - Fluss is the source of truth
if (!newSchema.equals(table.getSchema())) {
Contributor

@loserwang1024 loserwang1024 Dec 17, 2025

It would be better to check whether the lakehouse can apply the change before zookeeperClient.registerSchema:

  1. Check that Fluss can apply the schema change:
    SchemaUpdate.applySchemaChanges(table, schemaChanges)
  2. Check whether the schema can be applied to the lake.
  3. zookeeperClient.registerSchema(tablePath, newSchema, table.getSchemaId() + 1). Failure here should rarely happen, because it only involves ZooKeeper.
  4. syncSchemaChangesToLake.

For example, the Iceberg catalog currently does not support adding columns. If zookeeperClient.registerSchema succeeds first, a problem will occur.

Another remaining problem: what if zookeeperClient.registerSchema succeeds but syncSchemaChangesToLake fails? Assume that:

  1. Add column A
  • zookeeperClient.registerSchema succeeds
  • syncSchemaChangesToLake fails
  2. Add column B
  • zookeeperClient.registerSchema succeeds
  • syncSchemaChangesToLake succeeds
  3. Re-add column A
  • SchemaUpdate.applySchemaChanges is idempotent
  • syncSchemaChangesToLake succeeds

Finally, the Fluss table is: column A, column B
Finally, the lake table is: column B, column A

I advise that each time before zookeeperClient.registerSchema, we check whether the column count is the same:

  1. Add column A
  • zookeeperClient.registerSchema succeeds
  • syncSchemaChangesToLake fails
  2. Add column B
  • The column count does not match, so just refuse it.

@wuchong , WDYT?
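The proposed refusal guard could look roughly like this pure-Java sketch (the method name and error message are hypothetical, not the PR's actual code):

```java
import java.util.List;

public class ColumnCountGuard {
    // Refuse a new schema change if a previous lake sync failed and left
    // Fluss and the lake with different column counts (hypothetical check).
    static void checkInSync(List<String> flussColumns, List<String> lakeColumns) {
        if (flussColumns.size() != lakeColumns.size()) {
            throw new IllegalStateException(
                    "Fluss and lake schemas are out of sync ("
                            + flussColumns.size() + " vs " + lakeColumns.size()
                            + " columns); retry the previous schema change first.");
        }
    }

    public static void main(String[] args) {
        checkInSync(List.of("a", "b"), List.of("a", "b")); // in sync: proceed
        try {
            // Column A was registered in ZK but its lake sync failed earlier.
            checkInSync(List.of("a", "b"), List.of("b"));
        } catch (IllegalStateException e) {
            System.out.println("refused: " + e.getMessage());
        }
    }
}
```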

Contributor Author

ZK success but Lake sync fails
This is already handled by the idempotent design:

  • SchemaUpdate.addColumn() - treats an existing column with the same type/comment as a no-op
  • PaimonLakeCatalog.alterTable() - catches ColumnAlreadyExistException as success

So when user retries, both Fluss and Paimon will handle it gracefully.
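That no-op rule can be sketched in plain Java (names and the Column record are hypothetical; the real checks live in SchemaUpdate and PaimonLakeCatalog):

```java
import java.util.Objects;

public class IdempotentAddColumn {
    record Column(String name, String type, String comment) {}

    // Re-adding a column that already exists with the same type and comment
    // is treated as a no-op (retry success); a mismatch is a real conflict.
    static boolean isNoOp(Column existing, Column toAdd) {
        if (existing == null) {
            return false; // genuinely new column: must be added
        }
        if (!existing.type().equals(toAdd.type())
                || !Objects.equals(existing.comment(), toAdd.comment())) {
            throw new IllegalArgumentException(
                    "Column " + toAdd.name() + " exists with a different type/comment");
        }
        return true; // same definition: skip the add
    }

    public static void main(String[] args) {
        Column existing = new Column("c3", "INT", "new col");
        System.out.println(isNoOp(existing, new Column("c3", "INT", "new col")));
        System.out.println(isNoOp(null, new Column("c4", "STRING", null)));
    }
}
```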

Contributor

@loserwang1024 loserwang1024 Dec 17, 2025

@buvb Yes, if no other add column happens in between, it will be handled gracefully. However, as I mentioned before:

Assume that:

  1. Add column A
  • zookeeperClient.registerSchema succeeds
  • syncSchemaChangesToLake fails
  2. Add column B
  • zookeeperClient.registerSchema succeeds
  • syncSchemaChangesToLake succeeds
  3. Re-add column A
  • SchemaUpdate.applySchemaChanges is idempotent
  • syncSchemaChangesToLake succeeds

Finally, the Fluss table is: column A, column B
Finally, the lake table is: column B, column A

I have talked with Jark. He advises adding the column in the lakehouse before Fluss. Maybe each time before applying a schema change to Paimon, we can get the Paimon rowType and check whether it is safe:

RowType rowType = paimonCatalog.getTable(new Identifier(tablePath.getDatabaseName(), tablePath.getTableName())).rowType();

@loserwang1024
Contributor

@luoyuxia @wuchong, would you like to help with a final review?

Contributor Author

buvb commented Dec 23, 2025

@luoyuxia @wuchong The main changes are as follows:

  • Implemented lake-first schema sync, idempotent AddColumn handling, Paimon addColumn mapping, and tiering safeguards (fail when Fluss is wider, pad NULL when Paimon is wider), plus IT/unit coverage.
  • Recent fixes: PaimonTieringITCase now writes values for the new column after ADD COLUMN and asserts actual column names; offset check uses a dynamic index to avoid type mismatches after schema changes.
  • Testing: In an environment that allows binding ports, please run ./mvnw -pl fluss-lake/fluss-lake-paimon -DskipTests=false test. In restricted environments, ReCreateSameTableAfterTieringTest fails due to port bind being denied; all other tests pass.

Member

@wuchong wuchong left a comment

Could you please rebase your branch to trigger a fresh CI run? The base branch is quite outdated, and running against the latest changes will help uncover any potential issues that might otherwise be hidden.

assertThat(flussRecordAsPaimonRow.isNullAt(1)).isTrue();
assertThat(flussRecordAsPaimonRow.getInt(2)).isEqualTo(tableBucket);
assertThat(flussRecordAsPaimonRow.getLong(3)).isEqualTo(logOffset);
assertThat(flussRecordAsPaimonRow.getLong(4)).isEqualTo(timeStamp);
Member

Should assert with getTimestamp here, because this column is a timestamp type.

} catch (Catalog.ColumnAlreadyExistException | Catalog.ColumnNotExistException e) {
// shouldn't happen before we support schema change
} catch (Catalog.ColumnAlreadyExistException e) {
// Column already exists, treat as idempotent success for retry scenarios.
Member

Given that we may execute multiple TableChange operations in a single statement (e.g., adding several columns at once), blindly ignoring ColumnAlreadyExistException could silently skip the addition of some columns, leading to an incomplete schema update.

A simpler and safer approach, in my view, is to compare the current Paimon table schema with the expected target schema before performing any ALTER TABLE (like how you did in org.apache.fluss.server.coordinator.MetadataManager#alterTableSchema):

  • If the schemas differ, proceed with the ALTER TABLE and report any errors faithfully to the user (who can then re-execute if needed).
  • If the schemas already match, log a clear message (e.g., “Column(s) already exist—skipping ALTER TABLE”) and skip the operation.

This ensures correctness, avoids silent failures, and provides transparent feedback to the user.
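The compare-then-alter flow could be sketched as follows (pure Java with hypothetical names; the real code would compare Paimon's rowType against the target Fluss schema):

```java
import java.util.List;

public class SchemaDiffBeforeAlter {
    // Only issue ALTER TABLE when the current schema differs from the
    // target; otherwise log a clear message and skip the operation.
    static String decide(List<String> currentColumns, List<String> targetColumns) {
        if (currentColumns.equals(targetColumns)) {
            return "Column(s) already exist - skipping ALTER TABLE";
        }
        return "ALTER TABLE to " + targetColumns;
    }

    public static void main(String[] args) {
        // Schemas differ: proceed with the alter and surface any real errors.
        System.out.println(decide(List.of("a", "b"), List.of("a", "b", "c3")));
        // Schemas already match (e.g. a retry): skip with a clear log line.
        System.out.println(decide(List.of("a", "b", "c3"), List.of("a", "b", "c3")));
    }
}
```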

// Update Fluss schema (ZK) after Lake sync succeeds
if (!newSchema.equals(table.getSchema())) {
zookeeperClient.registerSchema(tablePath, newSchema, table.getSchemaId() + 1);
}
Member

log a clear message (e.g., “Column(s) already exist—skipping ALTER TABLE”) and skip the operation.

if (existingColumn != null) {
// Allow idempotent retries: if column name/type/comment match existing, treat as no-op
if (!existingColumn.getDataType().equals(addColumn.getDataType())
|| !Objects.equals(existingColumn.getComment(), addColumn.getComment())) {
Member

Suggested change
|| !Objects.equals(existingColumn.getComment(), addColumn.getComment())) {
|| !Objects.equals(existingColumn.getComment().orElse(null), addColumn.getComment())) {

The return type of existingColumn.getComment() is Optional<String>, which is not directly comparable with addColumn.getComment().

List<InternalRow> allRows = new ArrayList<>();
allRows.addAll(initialRows);
allRows.addAll(newRows);
checkDataInPaimonAppendOnlyTable(tablePath, allRows, 0);
Member

I suggest constructing an expected list of rows that includes all six columns (the new column, original user columns, and relevant system columns), and then asserting that this expected list exactly matches the rows read from Paimon. This will ensure comprehensive validation of both schema evolution and data correctness.

List<String> fieldNames = paimonTable.rowType().getFieldNames();

// Should have: a, b, c3, __bucket, __offset, __timestamp
assertThat(fieldNames).contains("a", "b", "c3");
Member

Assert exactly all the field names and their order, e.g. assertThat(fieldNames).containsExactly("a", "b", "c3", "__bucket", "__offset", "__timestamp").


Development

Successfully merging this pull request may close these issues.

Support schema evolution for data lake enabled table for paimon
