Changefeed gets stuck because the incremental scan cannot finish when many tables share one region. #1181

Open
asddongmen opened this issue Mar 27, 2025 · 1 comment
Labels
severity/moderate type/bug The issue is confirmed as a bug.

Comments

@asddongmen
Collaborator

After creating 10 million empty tables, PD merges them into only a little over 1,000 regions, so many tables share a single region.
Moreover, these regions are located on a small number of TiKV nodes, so the CDC incremental scan fully utilizes the CPU of the TiKV endpoint and cannot complete. This causes the changefeed to get stuck.

Image

Image

CDC Version

Release Version: v9.0.0-alpha-185-gb3c5be2a
Git Commit Hash: b3c5be2a5565c111a4babfe10b5b7f3bfb87da74
Git Branch: master
UTC Build Time: 2025-03-27 03:10:59
Go Version: go version go1.23.4 darwin/arm64
Failpoint Build: false

TiKV Version

v8.5.1
@asddongmen asddongmen added severity/moderate type/bug The issue is confirmed as a bug. labels Mar 27, 2025
@asddongmen
Collaborator Author

  • Even after I imported 15 TB of data and increased the number of regions to 60,000, the problem persisted.
  • Finally, after I reduced the region size to 24 MB, the regions split into 600,000 and the problem was resolved. By calculation, each region now holds only about 16 tables on average, so the incremental scan can complete.
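
The averages behind these three observations can be checked with simple arithmetic. Below is a minimal sketch in Python, assuming the table count stays at the 10 million from the original report and that tables are spread evenly across regions (a simplification; real distributions are likely skewed). The function name `avg_tables_per_region` is hypothetical and not part of TiCDC or TiKV.

```python
def avg_tables_per_region(table_count: int, region_count: int) -> float:
    """Return the average number of tables per region, assuming an even spread."""
    if region_count <= 0:
        raise ValueError("region_count must be positive")
    return table_count / region_count


# Table count from the report; assumed unchanged across all three scenarios.
TABLES = 10_000_000

# Region counts reported in this issue (the first is "a little over 1,000").
scenarios = {
    "empty tables after PD merge": 1_000,
    "after importing 15 TB": 60_000,
    "region size reduced to 24 MB": 600_000,
}

for label, regions in scenarios.items():
    avg = avg_tables_per_region(TABLES, regions)
    print(f"{label}: ~{avg:.1f} tables per region")
```

Under these assumptions the averages come out to roughly 10,000, 167, and 16.7 tables per region, which matches the "16 tables on average" figure above when rounded down. The threshold at which the incremental scan starts completing is not established by this issue; the numbers only show that it lies somewhere between about 167 and about 16 tables per region in this setup.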
