Evolution strategies #423
We should make the facilities of this approach available to user code, by
allowing a package to expose automigration tools that will be transparently
applied to its dependents.
Allowing users to write these tools runs the significant risk of them producing versions that are not sufficiently correct.
Yes. How much should we be worried about that? In some sense, this is "your dependencies can release a new version that breaks you", which I think will be the case regardless, but it does seem like the problem has a different character given that they can break anything in the dependent projects by performing completely arbitrary rewrites.
There's also the aspect that people will be building, executing, and deploying code that literally no-one has ever code-reviewed. I don't think that's abnormal, either -- there are lots of systems that generate code that (most of the time) no-one and nothing looks at the output of other than a compiler -- but again this would be happening at a larger scale than is common.
proposals/p0423.md
Outdated
migration tool, which may be surprising when relating diagnostics or
behavior back to the original source of an un-migrated package. For example,
source snippets in diagnostics may refer to code that doesn't match the
original source, and debug information may refer to generatd files instead
typo: s/generatd/generated/
Fixed.
proposals/p0423.md
Outdated
without changing the meaning of any program already in the set. For example,
this might include recognizing a new token that was previously invalid.
- A _removal_, that strictly decreases the set of valid input programs,
without changing the meaening of any program in the set.
s/meaening/meaning/
Fixed.
between options that otherwise provide similar value, we should prefer using
more expensive migration strategies over selecting an inferior end state.

When a choice of strategies is available, purely additive changes should be
I don't know that I believe that purely additive changes should be preferred. There is value in having a small core.
My intent is that this only applies subject to the "primary driver is the intended end state" above: only when the choices are largely equal on other merits should we consider this factor. I think I can express that more clearly.
proposals/p0423.md
Outdated

### Non-strategy: simultaneous migration

A number of strategies that require making simultaneous chanegs to multiple
s/chanegs/changes/
Fixed.
strategy. Therefore there is no requirement to reserve any lexical space to
prepare for future changes.

Therefore, we will no longer require whitespace after the `//` introducing a
What about `//*` or `//-`? Do we not need to leave that open as a potential new operator?
If we were to add such an operator, we could migrate all existing uses of `//*` or `//-` that introduce a comment to add whitespace after the `//`, as a point change. (I think we'd probably want to do something smarter, like looking for the enclosing sequence of consecutive comment lines and adding whitespace after the comment introducer across all of them. But in any case I think this can be handled as a point change.)
I think, broadly, if we can model an anticipated direction of evolution as point changes, we shouldn't try to guess what changes we'll want to make, because the cost of making those changes is sufficiently small. (For example, let's not proactively reserve a bunch of words that we think might be keywords, if we think the cost of reclaiming an identifier as a keyword is small.) If, on the other hand, an anticipated direction of evolution would require an incremental migration in response to changes, then we should be thinking about how to make such future changes easier.
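As a rough illustration of how small such a point change could be, here is a minimal Python sketch of a migration that adds whitespace after a comment introducer. It assumes comments only start at the beginning of a line (after optional indentation) and only rewrites `//*` and `//-`. The function name is hypothetical and not part of any Carbon tooling, and the "smarter" handling of consecutive comment lines is not attempted.

```python
import re

# Hypothetical pattern: a line-leading `//` immediately followed by `*` or `-`.
# Group 1 captures the indentation so it can be preserved.
_TIGHT_COMMENT = re.compile(r"^(\s*)//(?=[*-])")


def add_space_after_comment_introducer(source: str) -> str:
    """Rewrite line-leading `//*` and `//-` comments to `// *` and `// -`.

    Assumption: `//` at the start of a line always introduces a comment.
    Anything after other code on the same line is left untouched.
    """
    migrated = [
        _TIGHT_COMMENT.sub(r"\1// ", line) for line in source.split("\n")
    ]
    return "\n".join(migrated)
```

Running the sketch twice gives the same result as running it once, which is the property one would want from a migration applied across a whole ecosystem.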
FWIW, I think I may believe a little more than you that we should encourage reserving lexical space so that more future changes can be purely additive changes, even if we choose not to pursue them.
Right now it may not be worth guessing what changes we'll want to make -- Carbon is small, and if we added a token `//*` probably nothing would be broken, so we wouldn't really make a tool. However, as Carbon grows, those costs shift -- I think reserving lexical space is going to be cheaper than writing and running migrations (note this is also a burden to users who need to update their code). Thus point changes actually have a more significant cost long-term, pushing more for reserving lexical space.
So yeah, right now, don't reserve `//*`. If Carbon goes public and we still haven't really decided, reserve `//*`.
Hmm. I think I'd got too anchored to point changes being substantially cheaper than incremental changes and I'd lost sight of additive changes being substantially cheaper than point changes (a point change still churns the entire Carbon ecosystem as the migration tool is applied, in addition to the disadvantages listed in this proposal, whereas an additive change does not). Reserving lexical space to turn point changes into additive changes makes a lot of sense to me, but I agree that we don't need to do so now.
LGTM, I essentially agree with the point you're making here, for now at least.
addition, that in this instance occurs concurrently with the completion of
the removal phase and the removal of the `upcoming` marker.

### Guidance
Structural comment, not necessary to address in this proposal, but from a BLUF writing perspective the guidance feels like the bottom-line of this proposal, and thus how it should begin, rather than at the tail end.
In order to support changes to an interface, we allow newly-added methods to be
marked as `upcoming`. This indicates that the method is not required, and indeed
cannot be called (except by other `upcoming` functionality), but can be
implemented. Then the addition of an interface method can be staged as follows:
Is this about a change to the language, or a feature to allow evolution of user-defined interfaces? It feels like the document is mostly talking about the former, but this seems to be about the latter.
My primary focus when writing this document was about evolving the language and its standard library, but my intent was to cover both that and the needs of people evolving non-leaf packages implemented in Carbon. That said, I'd expect that things that people evolving Carbon software need are also things that we need to evolve the standard library.
- A method is introduced, declared `upcoming`. This is an addition, as
strictly more programs become valid.
- The intent to remove the `upcoming` marker is announced -- in this case,
implicitly, as all `upcoming` markers indicate an intent to remove the
marker. The removal period for this `upcoming` marker begins.
- Over time, the method is implemented by all implementers of the interface.
- The `upcoming` marker is removed. This is a removal, as it results in
strictly fewer programs being valid.
- Once the removal is complete, the new method can be used. This is an
addition, that in this instance occurs concurrently with the completion of
the removal phase and the removal of the `upcoming` marker.
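The staging above can be modeled as a small Python sketch that tracks which implementations and which calls are valid at each stage. This is only an illustration of the addition/removal classification, not Carbon semantics: the `Interface` class and the method names `Draw` and `Resize` are hypothetical, and the rule that `upcoming` functionality may call other `upcoming` methods is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Interface:
    required: set = field(default_factory=set)
    upcoming: set = field(default_factory=set)

    def accepts_impl(self, implemented: set) -> bool:
        # An implementation must provide every required method and may
        # additionally provide any method marked `upcoming`.
        return self.required <= implemented <= (self.required | self.upcoming)

    def accepts_call(self, method: str) -> bool:
        # Ordinary callers can only use methods that are not `upcoming`.
        return method in self.required


# Three snapshots of the same hypothetical interface.
before = Interface(required={"Draw"})
staged = Interface(required={"Draw"}, upcoming={"Resize"})
after = Interface(required={"Draw", "Resize"})

old_impl = {"Draw"}
new_impl = {"Draw", "Resize"}
```

In this model, moving from `before` to `staged` only makes more implementations valid (an addition), moving from `staged` to `after` makes `old_impl` invalid (a removal), and only `after` accepts a call to `Resize`.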
This example doesn't use a default implementation of the `upcoming` method. With a default, the new function can be used with much less latency. This may be painting incremental changes in an unfair light.
If there is a correct and generally-applicable default, the transition can be done as a point change, or perhaps even as a pure addition. I'm happy to switch to a different example; this one might be unhelpful by being similar to something we've been considering but with somewhat different details.

All subsequent builds using the new toolchain first migrate the source code to
the new syntax, and then pass it to the new toolchain, which only understands
the new syntax.
How do you see this migration tool being implemented and released in practice? It would need to be built on T-1 grammar and semantics. IIUC, the migration tool is separated from the compiler so that the compiler (and hence the grammar at time T) only needs to handle the new syntax. However, the compiler T needs to ship with a migration tool that understands T-1 syntax and semantics, and such handling is no longer available in current compiler libraries. So the migration tool can't use the T compiler as a library, it needs T-1.
It seems to me that either the migration tool will need to be built from a different branch than the compiler, or they can be built from the same source code, but with different feature flags enabled. I think branch-based development of the migrator will be a non-starter for the Carbon toolchain development process. If we use feature flags, we could just as well, for many migrations, allow the new compiler to understand both kinds of syntax (package migration flags will determine whether the old or new syntax is actually accepted).
While this distinction might look like an implementation detail, I think it is user-visible, as it mitigates a number of disadvantages described above. Rust's editions are very similar to this flag-based model, for example, the RFC for Rust 2021 says:
- Editions are used to introduce changes into the language that would otherwise have the potential to break existing code, such as the introduction of a new keyword.
- Editions are never allowed to split the ecosystem. We only permit changes that still allow crates in different editions to interoperate.
- Editions are named after the year in which they occur (e.g., Rust 2015, Rust 2018, Rust 2021).
- When we release a new edition, we also release tooling to automate the migration of crates. Some manual work may be required but that should be uncommon.
- The nightly toolchain offers "preview" access to upcoming editions, so that we can land work that targets future editions at any time.
- We maintain an Edition Migration Guide that offers guidance on how to migrate to the next edition.
- Whenever possible, new features should be made to work across all editions.
Note that editions allow for removals following a deprecation cycle (see RFC 2052):
When opting in to a new edition, existing deprecations may turn into hard errors, and the compiler may take advantage of that fact to repurpose existing usage, e.g. by introducing a new keyword. This is the only kind of breaking change a edition opt-in can make.
In a different place Nico clarifies:
The language of the RFC was very clear that you should get warnings in the latest compiler release. This basically means that so long as we have the migration lints, we're ok. It's not require that the warnings are there for the entire edition or anything.
I think it is very similar to our goals and our migration strategy. The only real differences I'd propose:
- Carbon should name editions after the month of the release date (e.g., 2021.4) to allow for faster evolution.
- The Carbon toolchain will support the latest edition, and non-latest editions for at least a certain amount of time (e.g., 6 months). Support for older editions will be dropped depending on the maintenance cost and user demand.
For migrating users of libraries over API changes we have the same issue. If libUiFramework releases v2 that requires a migration from v1, then to migrate a libCustomWidget we need libUiFramework v1 just to do semantic analysis of libCustomWidget before the migration, and v2 immediately after the migration to actually compile it. I think this is going to be difficult without a widely adopted package manager and build system; it might be more practical to see if all the necessary migration information can be included into just the libUiFramework v2.
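To make the flag-based model concrete, here is a minimal Python sketch of a single compiler front end that accepts or rejects an identifier depending on the edition a package has opted into. The edition names follow the month-based naming proposed above. The keyword sets and the function name are purely illustrative assumptions and do not describe the actual Carbon or Rust implementations.

```python
# Hypothetical edition table: the keywords each edition reserves.
KEYWORDS_BY_EDITION = {
    "2021.4": {"fn", "var"},
    "2021.10": {"fn", "var", "upcoming"},
}


def is_valid_identifier(name: str, edition: str) -> bool:
    """Check a name against the keywords reserved by the package's edition.

    The same function serves every edition, so packages on different
    editions can be handled by one toolchain.
    """
    return name.isidentifier() and name not in KEYWORDS_BY_EDITION[edition]
```

Under these assumptions, a package still on `2021.4` can keep using `upcoming` as an identifier, while a package that has migrated to `2021.10` gets an error for the same name.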
In order to support changes to an interface, we allow newly-added methods to be
marked as `upcoming`. This indicates that the method is not required, and indeed
cannot be called (except by other `upcoming` functionality), but can be
implemented. Then the addition of an interface method can be staged as follows:
Suggested change:
- implemented. Then the addition of an interface method can be staged as follows:
+ implemented. Then the addition of an interface method for which no default implementation is possible can be staged as follows:
We triage inactive PRs and issues in order to make it easier to find active work. If this PR should remain active, please comment or remove the
We triage inactive PRs and issues in order to make it easier to find active work. If this PR should remain active or becomes active again, please reopen it.
Proposal links (add links as proposal evolves):
[RFC topic](TODO)
[Decision topic](TODO)
[Decision PR](TODO)
[Announcement](TODO)
[Idea topic](TODO)
[TODO](TODO)