[FR] Ecosystem regression checks? #4920
Comments
Meson does something slightly similar. In our case, we publish release candidates once every few months and get them packaged in the experimental branches of multiple Linux distros, then ask distribution integrators to test a mass build of all packaged software and provide us (okay, I really mean "me" 😦) with the results of the builds to compare. Generally any widespread issue will be caught by this and fixed in time for the final release. It has saved us from getting egg on our faces a number of times...
Might be worth following the pattern from https://github.com/pypa/setuptools/blob/main/.github/workflows/ci-sage.yml or PyCA's “downstream” concept too?
Also, did you mean to link https://github.com/hauntsaninja/mypy_primer?
Thank you very much @hauntsaninja, I think that this is welcome, especially if there are volunteers to carry the implementation out. However, we would need to be very careful so that something like this is not held against the maintainers[1]. It is good to make informed decisions, but if there is no maintainer willing to take care of a specific functionality, that functionality will end up being removed. At the end of the day, informed decisions are only informed decisions, not a commitment that the maintainers will never introduce breaking changes. So ideally it would be interesting to have a system similar to the one you described, but one that also proactively warns other packages in the ecosystem about upcoming non-backwards-compatible changes. That would be fantastic, because right now the capacity that setuptools has for communicating breaking changes (or even useful information like https://github.com/pypa/setuptools/blob/v78.1.0/setuptools/command/editable_wheel.py#L489-L493 and https://github.com/pypa/setuptools/blob/v78.1.0/setuptools/command/editable_wheel.py#L555-L556) is seriously limited by the frontends hiding the warnings[2].
Footnotes:
1. Let's not oversimplify the situation. I invite everyone who wants to know the details to study the history of the deprecation warning: why it was introduced, how there have been subsequent problems with the implementation, the extra cost involved in deciding when to replace …
2. To be honest …
With Meson this is "easy" because we upload Release Candidates once a week for usually ~3 weeks before the final release, and this gets picked up by Gentoo. We have a special, long-term arrangement whereby Gentoo's distribution-wide continuous integration enables Release Candidates of Meson in 25% of all runs. Given a week or two of Release Candidate testing with hundreds of packages that depend on Meson, issues quickly show up and get reported back upstream. With setuptools there are thousands of packages.

Prereleases are great for this sort of thing, because they are, relatively speaking, extremely easy to test in a coordinated way (for PyPI you can simply have dedicated CI jobs that install all your dependencies with --pre) while still being opt-in and not breaking production or "required CI jobs". But you do have to commit to releasing on a schedule instead of simply whenever you make an exciting new change.
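For illustration, here is a minimal sketch of what such an opt-in `--pre` canary job could run. The `requirements.txt` file name, the scratch virtual environment, and the `canary_install` helper are all assumptions for the sake of the example, not part of any existing setuptools or Meson workflow.

```python
"""Hypothetical canary step for a non-required CI job: create a scratch
virtual environment and install the project's dependencies with --pre so
that release candidates (e.g. of setuptools) are picked up early."""
import subprocess
import sys
import venv
from pathlib import Path


def canary_install(requirements: str = "requirements.txt") -> int:
    env_dir = Path(".canary-venv")
    venv.EnvBuilder(with_pip=True, clear=True).create(env_dir)
    python = env_dir / ("Scripts" if sys.platform == "win32" else "bin") / "python"
    # --pre lets pip consider pre-releases of every dependency being resolved.
    cmd = [str(python), "-m", "pip", "install", "--pre", "--upgrade", "-r", requirements]
    return subprocess.run(cmd).returncode


if __name__ == "__main__":
    sys.exit(canary_install())
```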
FWIW, pip used to do pre-releases but dropped that because nobody was testing them and so it seemed pointless. It can be a powerful tool but requires someone to actually use it. I think that a primer-style check might be more useful, running in PRs.
Interestingly, ci-sage could've caught this if it hadn't been left unmaintained: https://github.com/pypa/setuptools/actions/runs/14147225459/job/39636009893?pr=4875#step:11:3914. But if @mkoeppe has dropped the ball, that might be a reason for deleting it.
I mentioned PyCA earlier but didn't link to what they do exactly. Here it is: https://github.com/pyca/cryptography/blob/30d6698/.github/workflows/ci.yml#L356-L411 / https://github.com/pyca/cryptography/tree/30d6698/.github/downstream.d.
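For comparison, a rough sketch of that "downstream" idea adapted to setuptools; the repository list, the `build_downstream` helper, and the pip invocations below are illustrative assumptions, not PyCA's actual configuration or an existing setuptools job.

```python
"""Hypothetical "downstream" check: build a few dependent projects against
the setuptools checkout under test. The repository list and commands are
illustrative only."""
import subprocess
import sys
import tempfile
from pathlib import Path

# Placeholder list; a real job would curate heavily used projects that build
# with setuptools as their backend.
DOWNSTREAM = [
    "https://github.com/pypa/sampleproject",
    "https://github.com/pypa/pip",
]


def build_downstream(repo_url: str, setuptools_src: Path) -> bool:
    with tempfile.TemporaryDirectory() as tmp:
        dest = Path(tmp) / "repo"
        subprocess.run(["git", "clone", "--depth", "1", repo_url, str(dest)], check=True)
        # Install the local setuptools checkout, then build the downstream
        # project's wheel without build isolation so that this checkout is the
        # backend actually being exercised.
        subprocess.run([sys.executable, "-m", "pip", "install", str(setuptools_src)], check=True)
        result = subprocess.run(
            [sys.executable, "-m", "pip", "wheel",
             "--no-build-isolation", "--no-deps", "-w", tmp, str(dest)]
        )
        return result.returncode == 0


if __name__ == "__main__":
    setuptools_checkout = Path(".").resolve()  # assume we run from the checkout root
    failed = [url for url in DOWNSTREAM if not build_downstream(url, setuptools_checkout)]
    print(f"{len(failed)} downstream build failure(s): {failed}")
    sys.exit(1 if failed else 0)
```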
I've opened #4929 to update this workflow |
I believe pre-releases are beneficial and can be quite effective in certain scenarios. I've experimented with them in the past, but currently the setuptools CI workflows are not very compatible with this practice. But can we tell whether any "opt-in" kind of solution would be more effective? For example, right now users can opt in to converting warnings into errors in a non-blocking CI pipeline for "monitoring purposes", but the adoption of this practice is not widespread. Would pre-releases face similar challenges?

Additionally, the release cadence of setuptools somehow "makes sense" to me now. I had never worked before on a project that uses the same methodology, so when I first joined it felt very unfamiliar. But after having worked on it for a while I can see the benefits: it is precisely the fact that setuptools releases smaller batches of changes that allows us to quickly track and fix major incidents (with same-day patch releases). If we were bundling changes together, this would not be an easy task, as successive changes inside the same batch tend to intertwine and become co-dependent.

We can discuss adding support for pre-releases to the setuptools CI tooling in a separate issue or PR (I think it would be a quality-of-life improvement), but I don't think they are the ultimate solution to the problem. I like the idea of having separate regression checks and also the option to automatically open issues in repositories prompting for changes. I think that has the potential of being quite effective.
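As a sketch of that opt-in "monitoring" idea, assuming PYTHONWARNINGS and the pypa/build frontend as the mechanism (neither is an existing setuptools CI feature, and the annotation syntax is GitHub-Actions-specific):

```python
"""Hypothetical non-blocking CI step: build the project with all warnings
escalated to errors and report, but never fail the required pipeline on,
any deprecation noise coming from the build backend."""
import os
import subprocess
import sys


def monitor_build() -> int:
    env = dict(os.environ)
    # Escalate every warning (including setuptools deprecation warnings that
    # frontends would otherwise hide) into an error inside the build process.
    env["PYTHONWARNINGS"] = "error"
    result = subprocess.run(
        [sys.executable, "-m", "build", "--wheel"],  # assumes the 'build' frontend is installed
        env=env,
    )
    if result.returncode != 0:
        # GitHub Actions warning annotation; the job itself stays green.
        print("::warning::build failed with warnings-as-errors (monitoring only)")
    # Always exit 0: the job is informational, not a required check.
    return 0


if __name__ == "__main__":
    sys.exit(monitor_build())
```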
@abravalheri I did talk about how Meson handles this exact concern. |
In the typing world, we also used to face a problem of causing regressions for users with larger impact than we expected. One of the ways we mostly resolved this was by putting ecosystem impact analyses into the CI of various typing-related projects, for instance typeshed, mypy, pyright, etc.
Given issues like #4910, or #4519 from a few months ago, and a few others beyond that, it appears setuptools sometimes releases changes that have a broader impact than expected. Maybe ecosystem checks would be useful here too?
This could look like scripts that attempt to build/install a large number of third party packages.
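For instance, here is a minimal sketch under the assumption that the candidate setuptools is already installed in the running environment and that the package list is just a placeholder sample:

```python
"""Hypothetical ecosystem check: build sdists of a sample of third-party
packages with the setuptools already installed in this environment (i.e.
the candidate under test) and count the failures."""
import subprocess
import sys
import tempfile
from pathlib import Path

# Placeholder sample; a real check would draw from the most-downloaded
# packages that actually use setuptools as their build backend.
SAMPLE_PACKAGES = ["requests", "six", "chardet"]


def build_from_sdist(name: str, workdir: Path) -> bool:
    # Download the sdist only, then build a wheel from it without build
    # isolation, so the pre-installed candidate setuptools does the work.
    download = subprocess.run(
        [sys.executable, "-m", "pip", "download", "--no-binary", ":all:",
         "--no-deps", "-d", str(workdir), name]
    )
    if download.returncode != 0:
        return False
    sdist = next(workdir.glob("*.tar.gz"), None) or next(workdir.glob("*.zip"), None)
    if sdist is None:
        return False
    build = subprocess.run(
        [sys.executable, "-m", "pip", "wheel", "--no-build-isolation",
         "--no-deps", "-w", str(workdir / "wheels"), str(sdist)]
    )
    return build.returncode == 0


if __name__ == "__main__":
    failures = []
    for pkg in SAMPLE_PACKAGES:
        with tempfile.TemporaryDirectory() as tmp:
            if not build_from_sdist(pkg, Path(tmp)):
                failures.append(pkg)
    print(f"{len(failures)}/{len(SAMPLE_PACKAGES)} sample packages failed to build: {failures}")
    sys.exit(1 if failures else 0)
```

The failure count from a run like this is exactly the kind of number that could feed the impact estimates discussed next.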
Data-driven estimates help us quantify the benefit of a change compared to its impact, e.g. if we knew it would break XYZ number of projects, we might decide enforcing an underscore vs hyphen isn't worth it. Backward compat in build tools is especially valuable given the role it plays in reproducibility (Python is used in a lot of science!), so I'm excited to explore the space of better quantifying backward-compat concerns.
Thanks for everything!