[FR] Ecosystem regression checks? #4920

Open
hauntsaninja opened this issue Mar 25, 2025 · 12 comments
Labels
enhancement, Long Term, Needs Discussion (issues where the implementation still needs to be discussed), proposal

Comments

@hauntsaninja
Contributor

In the typing world, we also used to face the problem of causing regressions with a larger impact on users than we expected. One of the ways we mostly resolved this was by putting ecosystem impact analyses into the CI of various typing-related projects, for instance typeshed, mypy, and pyright.

Given issues like #4910, #4519 from a few months ago, and a few others beyond that, it appears setuptools sometimes releases changes that have a broader impact than expected. Maybe ecosystem checks would be useful here too?

This could look like scripts that attempt to build/install a large number of third-party packages.
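A minimal sketch of such a script, in Python, might look like the following. The function names and the injectable `runner` argument are hypothetical and only serve to keep the example testable; it assumes the package list contains plain project names. Selecting the setuptools version under test (for example through a constraints file) is left out on purpose.

```python
import subprocess
import sys
import tempfile
from typing import Callable, Dict, Iterable, List

# A runner takes a command line and returns its exit code (hypothetical hook).
Runner = Callable[[List[str]], int]


def run_command(cmd: List[str]) -> int:
    """Run a command quietly and return its exit code (0 means success)."""
    return subprocess.run(cmd, capture_output=True).returncode


def build_from_sdist_cmd(package: str, wheel_dir: str) -> List[str]:
    """pip command that builds one package from source, without its dependencies."""
    return [
        sys.executable, "-m", "pip", "wheel",
        "--no-deps",             # only build the package itself
        "--no-binary", package,  # force a source build so the build backend runs
        "--wheel-dir", wheel_dir,
        package,
    ]


def check_packages(packages: Iterable[str], runner: Runner = run_command) -> Dict[str, bool]:
    """Try to build each package; map the package name to whether the build succeeded."""
    results: Dict[str, bool] = {}
    with tempfile.TemporaryDirectory() as wheel_dir:
        for package in packages:
            results[package] = runner(build_from_sdist_cmd(package, wheel_dir)) == 0
    return results
```

Calling `check_packages(["some-project", "another-project"])` once with the released setuptools and once with a candidate would give two result sets to compare.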

Data-driven estimates help us quantify the benefit of a change compared to its impact, e.g. if we knew it would break XYZ number of projects, we might decide enforcing an underscore vs hyphen isn't worth it. Backward compat in build tools is especially useful given the role it plays in reproducibility (Python is used in a lot of science!), so I'm excited to explore the space of better quantifying backward compat concerns.
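To illustrate the kind of number this could produce, here is a small Python sketch that compares two sets of build results. The function name, the result format (package name mapped to build success), and the sample data are all assumptions made for the example, not something an existing tool emits.

```python
from typing import Dict, List, Tuple


def find_regressions(
    baseline: Dict[str, bool], candidate: Dict[str, bool]
) -> Tuple[List[str], float]:
    """Return packages that built with the baseline but not with the candidate,
    plus the share of previously passing packages that regressed."""
    regressions = sorted(
        name
        for name, ok in baseline.items()
        # A package missing from the candidate results counts as a failure.
        if ok and not candidate.get(name, False)
    )
    passing = sum(1 for ok in baseline.values() if ok)
    rate = len(regressions) / passing if passing else 0.0
    return regressions, rate


# Hypothetical results for four made-up packages.
baseline = {"a": True, "b": True, "c": False, "d": True}
candidate = {"a": True, "b": False, "c": False, "d": True}
regressions, rate = find_regressions(baseline, candidate)
print(regressions)  # ['b']
```

Packages that already failed with the baseline (`c` here) are not counted, so the rate only reflects breakage introduced by the candidate.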

Thanks for everything!

@eli-schwartz
Contributor

Meson does something slightly similar. In our case, we publish release candidates once every few months and get them packaged in the experimental branch of multiple Linux distros, then ask distribution integrators to test a mass build of all packaged software and provide us (okay, I really mean "me" 😦) the results of the builds to compare. Generally any widespread issue will be caught by this and fixed in time for the final release.

It has saved us from getting egg on our faces, a number of times...

@webknjaz
Member

Might be worth following the pattern from https://github.com/pypa/setuptools/blob/main/.github/workflows/ci-sage.yml or PyCA's “downstream” concept too?

@webknjaz
Member

Also, did you mean to link https://github.com/hauntsaninja/mypy_primer?

@webknjaz added the enhancement, proposal, Needs Discussion, and Long Term labels Mar 26, 2025

@abravalheri
Contributor

abravalheri commented Mar 27, 2025

Thank you very much @hauntsaninja, I think that this is welcome, especially if there are volunteers to carry out the implementation.

However, we would need to be very careful that something like this is not held against the maintainers [1]. It is good to make informed decisions, but if no maintainer is willing to take care of a specific functionality, that functionality will end up being removed. At the end of the day, informed decisions are only informed decisions, not a commitment that the maintainers will never introduce breaking changes.

So ideally it would be interesting to have a system similar to the one you described that also proactively warns other packages in the ecosystem about upcoming backwards-incompatible changes. That would be fantastic because right now the capacity that setuptools has for communicating breaking changes (or even useful information like https://github.com/pypa/setuptools/blob/v78.1.0/setuptools/command/editable_wheel.py#L489-L493 and https://github.com/pypa/setuptools/blob/v78.1.0/setuptools/command/editable_wheel.py#L555-L556) is seriously limited by the frontends hiding the warnings [1].


> we might decide enforcing an underscore vs hyphen isn't worth it.

Let's not oversimplify the situation. I invite everyone who wants to know the details to study the history of the deprecation warning, why it was introduced, how there have been subsequent problems with the implementation, the extra cost involved in deciding when to replace - with _, and the status of the codebase. I also believe that the implementation existing before v78 had some oversights in it (some were fixed in v78, some were not because I did not want to introduce more complexity; I was also hoping that follow-up changes after the removal would be able to further simplify the code base). Overall it looks like high maintenance to me.

Footnotes

  1. The recent heated discussion contains examples of this kind of toxic argumentation.

@abravalheri
Contributor

abravalheri commented Mar 27, 2025

> Might be worth following the pattern from https://github.com/pypa/setuptools/blob/main/.github/workflows/ci-sage.yml or PyCA's “downstream” concept too?

To be honest, ci-sage.yml is something that does not work, in my opinion. It is too complex (and I don't have the energy to maintain it), too slow, and very often something goes wrong in it and it is difficult to tell whether the failure was actually caused by setuptools. In practice it is too maintenance-intensive and often ignored.

@eli-schwartz
Contributor

With Meson this is "easy" because we upload Release Candidates once a week for usually ~3 weeks before the final release, and this gets picked up by Gentoo. We have a special, long-term arrangement that Gentoo's distribution-wide continuous integration enables Release Candidates of Meson in 25% of all runs.

Given a week or two of Release Candidate testing with hundreds of packages that depend on meson, issues quickly show up and get reported back upstream. With setuptools there are thousands of packages.

Prereleases are great for this sort of thing, because they are relatively speaking extremely easy to test in a coordinated way (for PyPI you can simply have dedicated CI jobs that install all your dependencies with --pre) while still being opt-in and not breaking production or "required CI jobs". But you do have to commit to releasing on a schedule instead of simply whenever you make an exciting new change.
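As a rough illustration of such a dedicated job, a Python helper along these lines could drive the install step. The function names, the injectable `runner`, and the `requirements.txt` file name are assumptions for the sketch; `--pre` and `--upgrade` are regular pip options.

```python
import subprocess
import sys
from typing import Callable, List


def prerelease_install_cmd(requirements_file: str) -> List[str]:
    """pip command that installs the listed dependencies, allowing pre-releases."""
    return [
        sys.executable, "-m", "pip", "install",
        "--pre",      # also consider pre-release and development versions
        "--upgrade",  # move to the newest matching version, including release candidates
        "-r", requirements_file,
    ]


def install_prereleases(
    requirements_file: str,
    runner: Callable[[List[str]], int] = subprocess.call,
) -> bool:
    """Run the install and report whether it succeeded (exit code 0)."""
    return runner(prerelease_install_cmd(requirements_file)) == 0
```

A non-required CI job could call `install_prereleases("requirements.txt")` before running the normal test suite, so a failure signals a problem with an upcoming release without blocking merges.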

@webknjaz
Member

FWIW, pip used to do pre-releases but dropped that because nobody was testing them and so it seemed pointless. It can be a powerful tool but requires someone to actually use it. I think that a primer-style check might be more useful, running in PRs.

@webknjaz
Member

Interestingly, ci-sage could've caught this if it wasn't left unmaintained: https://github.com/pypa/setuptools/actions/runs/14147225459/job/39636009893?pr=4875#step:11:3914. But if @mkoeppe dropped the ball, it might be a reason for deleting it.

@webknjaz
Member

I mentioned PyCA earlier but didn't link what they do exactly. Here's what they do: https://github.com/pyca/cryptography/blob/30d6698/.github/workflows/ci.yml#L356-L411 / https://github.com/pyca/cryptography/tree/30d6698/.github/downstream.d.

@mkoeppe
Contributor

mkoeppe commented Mar 30, 2025

> ci-sage could've caught this if it wasn't left unmaintained: https://github.com/pypa/setuptools/actions/runs/14147225459/job/39636009893?pr=4875#step:11:3914

I've opened #4929 to update this workflow

@abravalheri
Contributor

I believe pre-releases are beneficial and can be quite effective in certain scenarios [1]. I've experimented with them in the past, but currently the setuptools CI workflows are not very compatible with this practice.

But can we tell if any "opt-in" kind of solution would be more effective? For example, right now users can opt in to convert warnings into errors in a non-blocking CI pipeline for "monitoring purposes", but the adoption of this practice is not generalised. Would pre-releases face similar challenges?
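For readers unfamiliar with the warnings-as-errors approach, the mechanism can be sketched with Python's standard `warnings` module. The `legacy_option` function is a hypothetical stand-in for a build step, and the generic `DeprecationWarning` is used in place of any setuptools-specific warning class.

```python
import warnings


def legacy_option() -> str:
    """Hypothetical stand-in for a build step that relies on deprecated behaviour."""
    warnings.warn("this option is deprecated", DeprecationWarning, stacklevel=2)
    return "built"


def run_strict(step) -> str:
    """Run a step with DeprecationWarning promoted to an exception,
    reporting the problem instead of letting it crash the caller."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        try:
            step()
        except DeprecationWarning as exc:
            return f"would break: {exc}"
    return "ok"


print(run_strict(legacy_option))  # would break: this option is deprecated
```

For a whole process, Python's `-W error::DeprecationWarning` option (or the `PYTHONWARNINGS` environment variable) has a similar effect; whether the warning reaches the job at all still depends on the frontend not hiding it.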

Additionally, the release cadence of setuptools somehow "makes sense" to me now. I have never worked before in a project that uses the same methodology, so when I first joined it felt very unfamiliar. But after having worked here for a while I can see the benefits: it is precisely the fact that setuptools releases smaller batches of changes that allows us to quickly track and fix major incidents (with same-day patch releases). If we were bundling changes together, this would not be an easy task, as successive changes inside the same batch tend to intertwine and become co-dependent.

We can discuss adding support for pre-releases to the setuptools CI tooling in a separate issue or PR (I think it would be a quality of life improvement), but I don't think they are the ultimate solution to the problem.

I like the idea of having separate regression checks and also the option to automatically open issues in repositories prompting for changes. I think that has the potential of being quite effective.

Footnotes

  1. For example, it is something that I would have liked to do for the PEP 639 implementation.

@eli-schwartz
Contributor

> But can we tell if any "opt-in" kind of solution would be more effective? For example, right now users can opt in to convert warnings into errors in a non-blocking CI pipeline for "monitoring purposes", but the adoption of this practice is not generalised. Would pre-releases face similar challenges?

@abravalheri I did talk about how Meson handles this exact concern.

5 participants