-
-
Notifications
You must be signed in to change notification settings - Fork 2.8k
langref: improve packed struct memory layout description #21741
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I have a suggestion and a question, but this is an improvement without addressing either of those.
I found an article about MMIO in zig: https://www.scattered-thoughts.net/writing/mmio-in-zig/ However, this was written in 2021, and integer-backed packed structs landed in 2022: #5049 (comment) I will attempt to tailor the information into some recommendation for MMIO. Right now, I am thinking:
But I won't be able to make a doc-test that actually uses |
MMIO example needs close review from someone familiar with this use-case. It is inspired by examples in the microzig framework: |
Looks even better to me now! I like the MMIO text and example, and while it makes sense to me, I don't have any particular experience with them to validate it. In the MMIO example, I'm torn between wanting to see more documentation about the |
Most CPUs (except rare old old ones) cannot individually change bits in memory. They can usually only read / write from the smallest addressable unit of memory, which is usually 1 byte. In this example, a micro controller would typically have a special memory address that can be written to manipulate GPIO (general purpose input and output). This example assumes the user has read the microcontroller datasheet, found an address of a register ( The thing to be careful about is not letting the reads / writes from memory to be optimized away (hence the In the example, the entire byte at address |
Technically I could remove the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does a good job of clarifying some aspects of packed structs, but there are a few parts I think should be trimmed down, they're more appropriate to a tutorial than a langref.
doc/langref.html.in
Outdated
<li>The backing integer has the same bit width as the fields' total bit width.</li> | ||
<li>The backing integer is subject to the same rules as any integer, including {#link|alignment|Alignment#}, | ||
having a maximum bit count of 65535, and the host endianness. | ||
On a big endian system, the first declared field will have the highest memory address, and on a little endian system, the lowest. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this part of the documentation is a good place to document how endianness works. This isn't really a documentation of what packed structs are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the harm in having this information here.
It also isn't just documentation "how endianness works", but how it interacts with struct layout here. The language used to be specified/implemented differently in this regard, and this point is often asked about (see inciting issue #20647 - andrew didn't close it, but reflagged it from bug to documentation issue ).
Ultimately the wording and scope is up to andrew to decide. I don't think trimming it now (on anyone else's opinion) is helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The GitHub comment system makes this unclear because of the context window: I was only referring to line 2188 here. Endianness is not really part of the user interface of packed structs, it's a fact about the backing integers.
The point of review is to come up with a better patch, so I disagree that it isn't helpful to do it. However, given the lack of consensus on this point, I think it's fine to err on the side of leaving it in.
doc/langref.html.in
Outdated
</li> | ||
<li>{#syntax#}bool{#endsyntax#} fields use exactly 1 bit.</li> | ||
<li>Due to {#link|alignment|Alignment#}, the backing integer may require more memory that its bit width. {#link|@bitSizeOf|@bitSizeOf#} and {#link|@sizeOf|@sizeOf#} can interrogate the difference.</li> | ||
<li>Field access and assignment can be understood as shorthand for bitshifts on the backing integer.</li> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one should be left off. The compiler has many options for implementing packed struct operations, many don't involve bitshifting at all.
How packed structs are translated into machine code is an implementation detail which doesn't need to be analogized here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is most relevant to the MMIO case, perhaps I could move it down there? See recent comments on this PR about how cpus cannot change individual bits in memory without a load/store of complete bytes.
I tried to keep it vague (using "can be understood as") because it makes sense that the compiler could optimize field assignments inside very large packed structs to only affect the changed memory, as you said, and because I do not know the exact specifics of the current or expected behavior when volatile is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem I see is that it isn't a complete description. For instance pack.field3 = true
might be implemented as backing_int | 0b0100
, that's a mask, not a shift. The documentation is a user guide: a complete description of the implementation of packed struct operations would be inappropriate, so I don't think a partial analogy is correct here either.
Compare to the integer literal documentation. It does say that signed values are always two's complement, that's useful to know: the C standard, for instance, does not (or didn't until quite recently) specify an integer representation. But it doesn't say anything about how the Zig compiler implements large integers like u256
, because while that's interesting, it isn't user documentation.
As for moving the idea to the MMIO case: yeah that makes perfect sense to me. A comment over a specific operation in the sample code saying "this is equivalent to {stuff}" which shows a mask, bitshift, whatever is correct, that's helpful, because it can be completely correct for the commented line, and illustrates the connection between packed struct field access and the manual-transmission way of doing these things in other low-level languages. So I'd say that's a good place for it.
By saying it's equivalent, it's also not a statement about what the implementation happens to do. Keeping implementation details out of the documentation is important, because it's no longer a detail if it's documented.
doc/langref.html.in
Outdated
(the use case for volatile packed structs) once this issue is resolved. | ||
Don't worry, there will be a good solution for this use case in zig. | ||
Packed structs can be used to interact with memory-mapped input-output (MMIO), which is | ||
common in embedded applications. A pointer of the correct alignment and address to a packed struct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having an illustration of MMIO is on balance good, but there's too much detail here. I'd cut this to:
Packed structs can be used to interact with memory-mapped input-output (MMIO):
Followed by the illustration code. If you think the detail about the pointer alignment and address is important for understanding the code snippet (perhaps) it could be a comment in the code snippet.
For me to provide a truly adequate MMIO example (to satisfy the TODO), I need to know the following:
|
Regarding impending love to the PR queue: As it stands right now, I think this PR is a strict improvement to the status quo and is mergable. If a core team member has feedback I am happy to implement it. As far as I know, most user code is careful to read/write the entire packed struct and not individual fields. For example: https://github.com/ZigEmbeddedGroup/microzig/blob/cf666d6137b85dfa5353b275c1b46b421f646f3f/core/src/mmio.zig#L47 |
853a66d
to
2c30e85
Compare
2c30e85
to
bd38c41
Compare
intended to close
#14384
#20647
Questions:
1. I don't know what to do about this TODO:> TODO update these docs with a recommendation on how to use packed structs with MMIO> (the use case for volatile packed structs) once this issue is resolved.> Don't worry, there will be a good solution for this use case in zig.Looks like this: