Antalya 26.5: DataLakeCatalog: avoid full catalog read for UNKNOWN_TABLE typo hints #1938

Open: zvonand wants to merge 2 commits into antalya-26.5 from feature/antalya-26.5/pr-1675

@zvonand commented Jun 23, 2026

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Avoid scanning the whole remote data lake catalog for “Maybe you meant …” table hints when show_data_lake_catalogs_in_system_tables is disabled (ClickHouse#100452 by @alsugiliazova).

Cherry-picked from ClickHouse#100452.

--- (#1675 by @zvonand).

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

Cherry-picked from #1675.


Documentation entry for user-facing changes

When show_data_lake_catalogs_in_system_tables = 0, the server must not implicitly scan the whole remote data lake catalog.
Previously, building the “Maybe you meant …” hint for a missing table in a DataLakeCatalog database still called getAllTableNames() → DatabaseDataLake::getTablesIterator(), which lists the entire catalog and loads per-table metadata. That is heavy work, and it can cause an OOM on large catalogs, all just to enrich an error message.

This change makes TableNameHints::getAllRegisteredNames() return an empty name list for data lake catalogs when that setting is off, so hint generation does not trigger a full catalog listing.
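The gating described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual ClickHouse code: `FakeDatabase`, `Settings`, and the free-function form of `getAllRegisteredNames` are hypothetical stand-ins, and the `full_listings` counter exists only to make the skipped catalog listing observable. Only the names `isDatalakeCatalog`, `getAllTableNames`, and the setting name are taken from this PR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a database object; not the real ClickHouse interface.
struct FakeDatabase
{
    bool is_datalake_catalog = false;
    std::vector<std::string> tables;
    mutable int full_listings = 0; // counts how often the expensive listing ran

    bool isDatalakeCatalog() const { return is_datalake_catalog; }

    // Models the expensive path: for a data lake catalog this would
    // list the whole remote catalog and load per-table metadata.
    std::vector<std::string> getAllTableNames() const
    {
        ++full_listings;
        return tables;
    }
};

// Hypothetical settings holder; the field mirrors the setting named in the PR.
struct Settings
{
    bool show_data_lake_catalogs_in_system_tables = false;
};

// Candidate names for "Maybe you meant ..." hints.
// For a data lake catalog with the setting off, return no candidates
// and never touch the remote catalog.
std::vector<std::string> getAllRegisteredNames(const FakeDatabase & db, const Settings & settings)
{
    if (db.isDatalakeCatalog() && !settings.show_data_lake_catalogs_in_system_tables)
        return {};
    return db.getAllTableNames();
}
```

With an empty candidate list, the hint machinery has nothing to compare the misspelled name against, so the error is reported without a suggestion and without a remote listing.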

Query examples

SET show_data_lake_catalogs_in_system_tables = 0;

Drop non-existent table:

DROP TABLE datalake.`schema1.table1`;

Previously (undesired): server could answer with UNKNOWN_TABLE and a “Maybe you meant …” suggestion after scanning catalog names:

Received exception from server (version 26.3.1):
Code: 60. DB::Exception: Received from localhost:9000. DB::Exception: Table datalake.`schema1.table1` does not exist. Maybe you meant datalake.`schema1.table`?. (UNKNOWN_TABLE)

After the fix: UNKNOWN_TABLE without hint when the setting is 0:

Received exception from server (version 26.4.1):
Code: 60. DB::Exception: Received from localhost:9000. DB::Exception: Table datalake.`schema1.table1` does not exist. (UNKNOWN_TABLE)

Optional follow-up

When local hints are skipped (data lake catalog with the setting at 0), getHintForTable can still fall back to getExtendedHintForTable, which scans only non-data-lake databases. In edge cases the suggestion could therefore point at a table in another database. If that is undesirable, a follow-up could skip extended hints under the same condition as local enumeration.
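To make the edge case concrete, here is a toy model of the fallback. Everything in it is an assumption made for illustration: `DbInfo`, `Catalog`, `looksSimilar`, the `skip_extended_when_gated` flag, and this simplified signature of `getHintForTable` do not come from the ClickHouse sources, and the similarity check is deliberately trivial.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of a database entry; not ClickHouse's real classes.
struct DbInfo
{
    bool is_datalake_catalog = false;
    std::vector<std::string> tables;
};

using Catalog = std::map<std::string, DbInfo>;

// Toy matcher: `candidate` matches if `name` is `candidate` plus one trailing character.
// Real hint logic is fuzzier; this is just enough for the example.
bool looksSimilar(const std::string & name, const std::string & candidate)
{
    return name.size() == candidate.size() + 1
        && name.compare(0, candidate.size(), candidate) == 0;
}

// Returns "db.table" for the first similar table, or an empty string if there is no hint.
std::string getHintForTable(
    const Catalog & catalog, const std::string & db_name, const std::string & table,
    bool show_data_lake_catalogs, bool skip_extended_when_gated)
{
    auto it = catalog.find(db_name);
    const bool gated = it != catalog.end()
        && it->second.is_datalake_catalog && !show_data_lake_catalogs;

    // Local hints: skipped entirely for a gated data lake catalog.
    if (it != catalog.end() && !gated)
        for (const auto & candidate : it->second.tables)
            if (looksSimilar(table, candidate))
                return db_name + "." + candidate;

    // Possible follow-up (hypothetical flag): also skip the extended search.
    if (gated && skip_extended_when_gated)
        return "";

    // Extended hints: scan non-data-lake databases only.
    for (const auto & entry : catalog)
    {
        if (entry.second.is_datalake_catalog)
            continue;
        for (const auto & candidate : entry.second.tables)
            if (looksSimilar(table, candidate))
                return entry.first + "." + candidate;
    }
    return "";
}
```

In this model, a typo in a gated data lake database can produce a hint that names a table in a different, local database unless the follow-up flag is set.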

zvonand and others added 2 commits June 24, 2026 00:02
…next commit)

---
Original cherry-pick message follows:

Merge pull request #1675 from Altinity/feature/antalya-26.3/ClickHouse-ClickHouse-pr-100452

Antalya 26.3: DataLakeCatalog: avoid full catalog read for `UNKNOWN_TABLE` typo hints
# Conflicts:
#	src/Interpreters/DatabaseCatalog.cpp
The source PR's change (gate getAllRegisteredNames on
show_data_lake_catalogs_in_system_tables via isDatalakeCatalog) is
already present on antalya-26.5 in generalized superset form: commit
d5f46f6 (backport ClickHouse#104416) renamed isDatalakeCatalog ->
isRemoteDatabase and show_data_lake_catalogs_in_system_tables ->
show_remote_databases_in_system_tables, broadening the skip to all
remote databases (data lake catalogs, MySQL, PostgreSQL). The bucket-2
token-swap of the source PR's lines lands exactly on the existing
antalya-26.5 code, so the conflict resolves to "ours".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@zvonand added the releasy (Created/managed by RelEasy), antalya-26.5, and ai-resolved (Port conflict auto-resolved by Claude) labels Jun 23, 2026
@github-actions

Workflow [PR], commit [2b87629]
