Skip to content

Refactor HTTP download logic #13383

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

ichard26
Copy link
Member

@ichard26 ichard26 commented May 6, 2025

I was never really happy with how the download logic turned out after PR #12991. This is my second attempt at refactoring the downloading logic to be easier to read/follow. The main change is refactoring the Downloader class to use a separate dataclass to store state for individual downloads. This greatly reduces the number of names being passed around in the download logic.

It's easiest this to review by reviewing commit-by-commit.

ichard26 added 6 commits May 5, 2025 20:50
BatchDownloader is a wrapper class over Downloader that is effectively
a glorified for loop. It doesn't need to exist.
The Downloader code is hard to follow as it passes a bunch of variables
across a variety of functions and methods. To deal with this mess,
encapsulate the state needed to download a single link into a separate
_FileDownload class and use that instead.
The value used for conditional range requests should be updated when the
download is restarted (not *resumed*). The unit tests testing
conditional range requests were broken, using code 206 for the
non-matching responses. This violates the HTTP spec and fools the
resumption logic into thinking the conditional range request was
fulfilled (when it wasn't!).

The test cases were fixed to use code 200 as expected and extended
to ensure the content key is updated across a chain of non-matching
repsonses (which is borderline impossible, but hey it did catch my
eye so why not fix it).
@ichard26 ichard26 added type: refactor Refactoring code skip news Does not need a NEWS file entry (eg: trivial changes) labels May 6, 2025
Copy link
Member

@uranusjr uranusjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes sense to me.

@ichard26
Copy link
Member Author

@gmargaritis if you have time, I'd appreciate a look-over for this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
skip news Does not need a NEWS file entry (eg: trivial changes) type: refactor Refactoring code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants