Migration took too long #17087

Closed
5 tasks done
wanghan5 opened this issue Mar 29, 2025 · 2 comments · Fixed by #17088
Labels
🐞 bug Something isn't working

Comments

@wanghan5
Contributor

Self Checks

  • This is only for bug reports; if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.15.3

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

The migration took more than 20 minutes when updating the embeddings table, which contains about 80 thousand items.

✔️ Expected Behavior

Less than 10 minutes.

❌ Actual Behavior

It took more than 20 minutes.

@dosubot dosubot bot added the 🐞 bug Something isn't working label Mar 29, 2025
@wanghan5 wanghan5 mentioned this issue Mar 29, 2025

dosubot bot commented Mar 29, 2025

To address the issue of long migration times when updating the embeddings table in Dify, consider the following potential optimizations:

  1. Database Indexing: Ensure that the necessary indexes are in place for the columns involved in the migration, particularly the provider_name column, to speed up the update operations.

  2. Batch Processing: If the migration script processes records one by one, consider modifying it to process records in batches. This can reduce the overhead of individual transactions and improve performance.

  3. Database Configuration: Review and optimize your database configuration settings, such as increasing the SQLALCHEMY_POOL_SIZE to allow more concurrent connections, which can help speed up the migration process [1].

  4. Hardware Resources: Ensure that your server has sufficient CPU and RAM resources. The recommended minimum is 2 CPU cores and 4GB of RAM, but more resources may be needed for large datasets [1].

  5. Logging and Monitoring: Implement logging and monitoring to identify bottlenecks during the migration process. This can help pinpoint specific areas that need optimization.

These steps can help reduce the time taken for database migrations. If the issue persists, further investigation into the specific migration script and database performance may be necessary.
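
As an illustration of the batch-processing suggestion above, here is a minimal sketch that rewrites rows in primary-key-ordered batches and commits after each one. It is not the actual Dify migration: it uses an in-memory SQLite database as a stand-in for the real database, and the two-column table layout, the `hypothetical-prefix/` value, and the `migrate_in_batches` helper are all assumptions made for the example.

```python
import sqlite3


def migrate_in_batches(conn, batch_size=1000):
    """Prefix provider names that lack a '/', one primary-key range at a time."""
    last_id = 0
    total = 0
    while True:
        # Walk the table by primary key so each batch is a cheap range lookup.
        rows = conn.execute(
            "SELECT id FROM embeddings WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        first_id, last_id = rows[0][0], rows[-1][0]
        cur = conn.execute(
            "UPDATE embeddings "
            "SET provider_name = 'hypothetical-prefix/' || provider_name "
            "WHERE id BETWEEN ? AND ? AND provider_name NOT LIKE '%/%'",
            (first_id, last_id),
        )
        total += cur.rowcount
        conn.commit()  # keep each transaction short
    return total


# Demo data: 2500 rows that need the prefix and 500 that already have one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embeddings (id INTEGER PRIMARY KEY, provider_name TEXT)")
conn.executemany(
    "INSERT INTO embeddings (provider_name) VALUES (?)",
    [("openai",)] * 2500 + [("org/openai",)] * 500,
)
conn.commit()

updated = migrate_in_batches(conn, batch_size=1000)
remaining = conn.execute(
    "SELECT COUNT(*) FROM embeddings WHERE provider_name NOT LIKE '%/%'"
).fetchone()[0]
print(updated, remaining)
```

Bounding each statement by a primary-key range keeps the pattern filter confined to one batch of rows at a time, so no single transaction has to touch the whole table. Whether this helps in practice depends on the real schema and database engine.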


@wanghan5
Contributor Author


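The migration SQL filters with "{provider_column_name} not like '%/%'", which makes the provider_name index useless: a pattern with a leading wildcard cannot be answered by an index lookup, so every row has to be scanned.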
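
A small sketch of this effect, assuming an in-memory SQLite database as a stand-in (the production database and its planner will differ in detail). The table layout, the `hash` column, and the `provider_name_idx` index name are hypothetical and chosen only for the example. It compares the query plan for an equality filter with the plan for the `NOT LIKE '%/%'` filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE embeddings (id INTEGER PRIMARY KEY, provider_name TEXT, hash TEXT)"
)
# Hypothetical index name; the real schema may name it differently.
conn.execute("CREATE INDEX provider_name_idx ON embeddings (provider_name)")


def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


# An equality filter can be answered by searching the index.
equality_plan = plan("SELECT hash FROM embeddings WHERE provider_name = 'openai'")

# A negated pattern with a leading wildcard cannot, so the table is scanned.
not_like_plan = plan("SELECT hash FROM embeddings WHERE provider_name NOT LIKE '%/%'")

print("equality:", equality_plan)
print("not like:", not_like_plan)
```

The exact plan text varies between SQLite versions, but the equality query reports a search using the index while the `NOT LIKE` query reports a scan that does not mention it.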

@crazywoola crazywoola closed this as not planned Mar 29, 2025
@crazywoola crazywoola reopened this Mar 29, 2025