Skip to content

[DO-NOT-MERGE] populate requirement.specifier on demand #13323

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

ichard26
Copy link
Member

@ichard26 ichard26 commented Apr 7, 2025

I was looking at the distribution discovery performance improvements of #12308 and I realized that ~1/3 of packaging Requirement construction is spent on creating the specifier (SpecifierSet) attribute. It's expensive as each of the contained specifiers have to be hashed immediately.

While the specifier attribute is accessed frequently, not all operations need it. For example, pip show six sped up by ~20ms with 114 packages installed if modify packaging to populate the specifier attribute on-demand.

@notatallshaw I'm curious to whether this has any impact on dependency resolution. While any requirement considered during dependency resolution is obviously going to require the the version specifier, there are a lot of requirements that aren't ever considered. Typically these are all of the requirements from extras.

For example, if I further modify packaging to keep track of how many Requirement objects are constructed and how many have their specifier attribute materialize, I get 23/180 for pip install --dry-run --ignore-installed black pytest mypy nox.

If you have time, I'd be interested in whether this results in a measurable improvement.

@ichard26 ichard26 added the skip news Does not need a NEWS file entry (eg: trivial changes) label Apr 7, 2025
@ichard26
Copy link
Member Author

ichard26 commented Apr 7, 2025

Cross-referencing the apache-airflow[all] numbers from #12663 (comment), this may save fractions of a second 🙃

@notatallshaw
Copy link
Member

notatallshaw commented Apr 7, 2025

Very quickly, a bit of benchmarking for the classic apache-airflow[all] I saw a 1-1.5% performance improvement, and it was statistically significant, running:

hyperfine --warmup 1 --runs 5 "pip install --dry-run --ignore-installed --progress-bar off --disable-pip-version-check apache-airflow[all]"

I can do some more profiling if you would like? But it seems so simple that I would say let's just add it?

The only downside I see is that it means specifiers don't eagerly fail if they are invalid, but I don't see that as being a general issue, if they are relevant they will eventually be evaluated.

I wonder also about self.marker?

@ichard26
Copy link
Member Author

ichard26 commented Apr 7, 2025

But it seems so simple that I would say let's just add it?

Well, this is packaging's code I'm modifying, so I'd have to submit a PR to packaging and justify it to the maintainers. I don't think raising the invalid specifier error on read is an acceptable change. However, the requirement parser does seem to reject (some?) invalid specifiers, so perhaps it could be okay?

@notatallshaw
Copy link
Member

Well, this is packaging's code I'm modifying, so I'd have to submit a PR to packaging and justify it to the maintainers.

Oh yeah, I read over the PR changes too quickly.

I don't think raising the invalid specifier error on read is an acceptable change. However, the requirement parser does seem to reject (some?) invalid specifiers, so perhaps it could be okay?

My understanding of packaging’s maintainers attitude towards this kind of change is it’s unlikely to be accepted by packaging, there's been long discussions about markers being short curcuited and there hasn't been any consensus.

And I can see that for certain tooling you would want to eagerly reject invalid specifiers passed to the requirements object.
This is a good example of where having a single canonical implementation of the spec doesn't always work out that well, as different use cases would benefit from different implementation details.

You can always raise a PR to see what they think.

@ichard26
Copy link
Member Author

ichard26 commented Apr 7, 2025

Closing as this is unlikely to be acceptable to upstream to packaging.

@ichard26 ichard26 closed this Apr 7, 2025
@ichard26 ichard26 deleted the experiment branch April 7, 2025 03:42
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 22, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
skip news Does not need a NEWS file entry (eg: trivial changes)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants