BUG: Error uploading big files #4069
Comments
What does your deployment look like? Is there a reverse proxy, e.g. with a timeout, in front of it? This looks more like a "bug" in the specific deployment setup, not necessarily in the app itself. Another question would be: what kind of file is this? If it is a compressed archive, I'd suggest extracting it first before uploading it anyway.
No, it is a local-network, offline deployment. There is no reverse proxy, firewall, or other system in between. The files are uncompressed CSVs. Even when the upload succeeds and Aleph processes the file, no files or data show up after it finishes. If I upload e.g. 500 lines of one of the big CSV files, the data does show up in the UI. Can I somehow debug this?
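Since ~500 lines apparently do work, one way to narrow this down is to split the big CSV into fixed-size chunks and find the size at which processing starts to fail. A minimal sketch; the file name and chunk size below are placeholders, not anything from this thread:

```python
import csv
from pathlib import Path

SOURCE = Path("big_file.csv")   # placeholder: path to the large CSV
CHUNK_ROWS = 100_000            # placeholder: rows per chunk, adjust while testing

def write_chunk(header, rows, part):
    """Write one chunk next to the source file, repeating the header row."""
    out = SOURCE.with_name(f"{SOURCE.stem}_part{part:04d}.csv")
    with out.open("w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(rows)

with SOURCE.open(newline="", encoding="utf-8") as src:
    reader = csv.reader(src)
    header = next(reader)
    rows, part = [], 0
    for row in reader:
        rows.append(row)
        if len(rows) >= CHUNK_ROWS:
            write_chunk(header, rows, part)
            rows, part = [], part + 1
    if rows:
        write_chunk(header, rows, part)
```

Uploading the chunks one by one should at least show whether the failure depends on file size or on specific content.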
I am having a similar issue and had to flush the queue to get anything to process; it was as if things were stuck, so I'm wondering whether there is a limit somewhere. I also cannot generate entities from large CSV files. I'm experimenting to see whether a smaller file of the same format will actually show up and allow entity generation.

We know large files can be supported: the OCCRP instance has some pretty large CSV files that work just fine. What we don't have is any information on how to tune an instance to process larger files more efficiently. I have increased the instance size several times with no change, and resources are not pegging, so there must be a limit somewhere else in the platform or its components that could be tweaked for performance. I'm tired of seeing this screen where it never populates, especially since entity extraction is where the value of this platform lies. The columns never load, and it's not really all that big a file.
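For what it's worth, instead of flushing everything blindly, a queue can be inspected (and, only as a last resort, purged) with a few lines of pika. This is a rough sketch under some assumptions: RabbitMQ reachable on localhost with default credentials, and "ingest" as the queue name, which is only a guess about Aleph's queue naming — check the actual names in the management UI first:

```python
import pika

# Assumption: RabbitMQ on localhost:5672 with default guest/guest credentials.
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# Passive declare only inspects the queue; it raises if the queue does not exist.
# "ingest" is an assumed queue name, not confirmed in this issue.
info = channel.queue_declare(queue="ingest", passive=True)
print("messages waiting:", info.method.message_count)

# Purging drops the stuck tasks for good -- only do this if you accept losing them.
# channel.queue_purge(queue="ingest")

connection.close()
```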
Greetings. While uploading a big document (a 2 GB CSV), the ingest stage actually crashes with the following error:
In fact, when I upload the document and check the RabbitMQ queue in the management UI, I can confirm that there is an unacked message in the queue. With that being said, the error I am getting might be related to the one mentioned in this issue. So there are actually two solutions:
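To confirm the unacked message without clicking through the management UI, the RabbitMQ management plugin's HTTP API can be queried directly. A minimal sketch, assuming the default port 15672, guest/guest credentials and the default vhost "/" (all of which may differ in your deployment):

```python
import requests

# Assumptions: management plugin on localhost:15672, guest/guest credentials,
# default vhost "/" (URL-encoded as %2F). Adjust for your setup.
resp = requests.get(
    "http://localhost:15672/api/queues/%2F",
    auth=("guest", "guest"),
    timeout=10,
)
resp.raise_for_status()

for queue in resp.json():
    print(
        queue["name"],
        "ready:", queue.get("messages_ready"),
        "unacked:", queue.get("messages_unacknowledged"),
    )
```

A message that stays unacked for a long time usually means a consumer picked it up and then died or stalled before acknowledging it.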
Describe the bug
I deployed a production instance of Aleph. Specs: Ubuntu Server 22.04, 2 TB disk space, 128 GB RAM and an 8-core CPU. I installed the latest release, 4.0.2. I am able to upload files of up to 15 GB via the UI and alephclient. Files bigger than 15 GB never finish the upload phase. Files bigger than 50 GB lead to a never-ending upload loop. No error at all.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Show a reason why the upload went wrong. Show a hint before uploading, such as "file is too big" or something like that. Show a hint about what an admin can do to enable the instance to handle such files.
Aleph version
4.0.2
Additional context
I checked the documentation but found nothing helpful. Is there any chance to change an environment variable or a setting to enable my Aleph instance to handle such big files?
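One workaround that may be worth trying while this is open: split the big CSV into chunks (see the splitting sketch above) and upload them one by one with the alephclient Python API. A rough sketch only; the host, API key, foreign ID and chunk directory are placeholders, and the method names (load_collection_by_foreign_id, ingest_upload) are my assumption about the alephclient package — double-check them against your installed version:

```python
from pathlib import Path
from alephclient.api import AlephAPI

# Placeholders: replace with your instance URL and API key.
api = AlephAPI(host="http://aleph.local", api_key="YOUR_API_KEY")

# "big-csv-import" is a placeholder foreign ID for the target collection.
collection = api.load_collection_by_foreign_id("big-csv-import")

# Upload every chunk from a local directory (placeholder path).
for chunk in sorted(Path("/data/chunks").glob("*.csv")):
    api.ingest_upload(collection["id"], chunk, metadata={"file_name": chunk.name})
```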