beginnings of (non-LLVM) self-hosted machine code generation and linking #5158
also coerce no longer requires a bitcast
src-self-hosted/link.zig
Outdated
const fs = std.fs;
const elf = std.elf;

const executable_mode = 0o755;
Suggested change:
- const executable_mode = 0o755;
+ const executable_mode = 0o777;
On common systems with a 022 umask, this will still result in a file created with 755 permissions, but it works appropriately if the system is configured more leniently. (As another data point, C's fopen seems to open files with the 666 mode.)
Checking what GCC and Clang do under strace, they seem to:
- create the file with 666 permissions,
- use stat to determine the actual resulting permissions (modified by the umask),
- use umask twice, first to get the current umask and then to reset it (there isn't a side-effect-free way to get the umask that I know of),
- use chmod to add the executable bit, manually masking out the umask.
As a specific example, running (umask 124 && strace -ff clang test.c) produced
[pid 24421] openat(AT_FDCWD, "a.out", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3
[pid 24421] stat("a.out", {st_mode=S_IFREG|0642, st_size=16312, ...}) = 0
[pid 24421] umask(000) = 0124
[pid 24421] umask(0124) = 000
[pid 24421] chmod("a.out", 0653) = 0
(among other lines, of course). GCC is similar. Opening with 777 permissions from the start would produce the same result in a much simpler way.
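The arithmetic behind the trace above can be checked with a small Zig sketch. The helpers effectiveMode and createThenChmod are hypothetical names for illustration: they only model the permission bits, make no system calls, and are not what link.zig does. The tests assume a reasonably recent Zig in which std.testing.expect returns an error union.

```zig
const std = @import("std");

/// Permission bits that result when a file is created with `requested`
/// mode under the process umask `mask`. Hypothetical helper, illustration only.
fn effectiveMode(requested: u32, mask: u32) u32 {
    return requested & ~mask;
}

/// Models the sequence seen in the trace: create with 0666, then chmod
/// to add the execute bits that the umask permits.
fn createThenChmod(mask: u32) u32 {
    const created = effectiveMode(0o666, mask);
    return created | (0o111 & ~mask);
}

test "opening with 0o777 matches the create-then-chmod sequence" {
    // Values from the strace output: umask 0124 gives 0642, then 0653.
    try std.testing.expect(effectiveMode(0o666, 0o124) == 0o642);
    try std.testing.expect(createThenChmod(0o124) == 0o653);
    try std.testing.expect(effectiveMode(0o777, 0o124) == 0o653);
    // The common 022 umask yields the familiar 755.
    try std.testing.expect(effectiveMode(0o777, 0o022) == 0o755);
}
```

Running this with zig test shows that, for these masks, requesting 777 up front lands on the same mode as the four-syscall sequence.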
Thanks! I committed your suggestion along with your explanation as a comment.
lib/std/mem.zig
Outdated
/// Round an address up to the previous aligned address
/// The alignment must be a power of 2 and greater than 0.
pub fn alignBackwardGeneric(comptime T: type, addr: T, alignment: T) T {
    assert(@popCount(T, alignment) == 1);
    // 000010000 // example addr
Suggested change:
- // 000010000 // example addr
+ // 000010000 // example alignment
I know this isn't actually changed, but while you are here and it appears in context, alignment is the value that is shifted and inverted. (As a side note, negation is equivalent: in two's complement, -x = ~(x - 1).)
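To make the mask and the negation identity concrete, here is a simplified, non-generic sketch. The function alignBackward below is written for this illustration and is not the body of alignBackwardGeneric from the PR. Zig does not allow plain negation of unsigned integers, so the sketch uses wrapping subtraction from zero as the stand-in for -x.

```zig
const std = @import("std");

/// Rounds `addr` down to the nearest multiple of `alignment`, which must be
/// a power of two. Illustrative sketch, not the std.mem implementation.
fn alignBackward(addr: usize, alignment: usize) usize {
    std.debug.assert(alignment != 0 and (alignment & (alignment - 1)) == 0);
    // alignment - 1 sets every bit below the alignment bit;
    // inverting that gives a mask that clears those low bits.
    return addr & ~(alignment - 1);
}

test "alignBackward and the negation identity" {
    try std.testing.expect(alignBackward(0x1234, 0x1000) == 0x1000);
    try std.testing.expect(alignBackward(17, 8) == 16);
    try std.testing.expect(alignBackward(16, 8) == 16);
    // Wrapping negation of an unsigned value equals ~(x - 1).
    const alignment: usize = 16;
    try std.testing.expect((0 -% alignment) == ~(alignment - 1));
}
```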
This is the beginning of a self-hosted incremental linker (#1535) as well as a pure Zig backend.
Here's a demo:
hello.zir
Now run the example program:
This example code renders back to ZIR after semantic analysis is complete, and also creates a.out:
Here you can see the "Hello, World!" string is embedded directly into the function. That's not necessarily how it is going to work in the future.
A plan for incremental recompilation
I do have a plan for incremental compilation now (only applicable to debug mode):
The idea is that every top-level declaration in Zig maps to an ELF symbol. All functions are generated as position-independent code. Part of what the cache stores is a mapping from each Zig top-level decl to the decls that depend on it directly.
When a file is detected as changed, we re-parse the AST of that file, looking for AST changes per decl. That gives us the set of decls that changed. Follow the dependency tree from there, regenerating decls.
Now we have a set of regenerated decls, and each one maps directly to a symbol in the ELF file. So we only have to modify the symbols in the ELF file that correspond to changed decls.
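The "follow the dependency tree" step can be sketched as a fixed-point walk over the decl-to-dependents mapping. Everything below is an assumption made for illustration: the four decls (main, helper, msg, unused), the dense boolean table, and the markDirty helper are hypothetical and do not reflect how the cache in this PR stores its data.

```zig
const std = @import("std");

// Hypothetical decl indices: 0 = main, 1 = helper, 2 = msg, 3 = unused.
const decl_count = 4;

// dependents[i][j] == true means decl j depends directly on decl i.
const dependents = [decl_count][decl_count]bool{
    [decl_count]bool{ false, false, false, false }, // main: no dependents
    [decl_count]bool{ true, false, false, false }, // helper: used by main
    [decl_count]bool{ false, true, false, false }, // msg: used by helper
    [decl_count]bool{ false, false, false, false }, // unused: no dependents
};

/// Starting from the decls already marked as changed, marks every decl that
/// depends on them, directly or transitively, until nothing new is added.
fn markDirty(dirty: *[decl_count]bool) void {
    var changed = true;
    while (changed) {
        changed = false;
        var i: usize = 0;
        while (i < decl_count) : (i += 1) {
            if (!dirty[i]) continue;
            var j: usize = 0;
            while (j < decl_count) : (j += 1) {
                if (dependents[i][j] and !dirty[j]) {
                    dirty[j] = true;
                    changed = true;
                }
            }
        }
    }
}

test "changing msg regenerates helper and main only" {
    var dirty = [decl_count]bool{ false, false, true, false };
    markDirty(&dirty);
    try std.testing.expect(dirty[0] and dirty[1] and dirty[2] and !dirty[3]);
}
```

In this model, each entry left true in dirty corresponds to one ELF symbol that would need to be rewritten; untouched decls such as unused keep their existing symbol.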
Next steps