Skip to content

Change main from extern "C" to extern "Rust" #34

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

morr0ne
Copy link
Contributor

@morr0ne morr0ne commented Sep 3, 2023

Currently origin requires users of the library to supply a main function with the following definition

extern "C" {
    fn main(argc: c_int, argv: *mut *mut u8, envp: *mut *mut u8) -> c_int;
}

However it is not necessary to use an extern C there since this never crosses any ffi boundary and is always compiled from source. This pr changes this from an extern C to an extern Rust. This allows to omit any extern declarations when writing main since the Rust extern is always implied.

Main function before this change:

#[no_mangle]
extern "C" fn main(argc: i32, argv: *const *const u8) -> i32 {} 

Main function after this change

#[no_mangle] // No mangle is still required since we are still linking from an external library
fn main(argc: i32, argv: *const *const u8) -> i32 {} 

I had opened #33 at first without realizing I was working with a very outdated branch so this supersedes that

@sunfishcode
Copy link
Owner

sunfishcode commented Sep 4, 2023

Cool, I didn't know you could do that.

A concern I have is that this change would break mustang, because in mustang, origin calls an extern "C" main declaration and rustc provides a definition of extern "C" main.

Maybe the way to fix this is though to make mustang handle this. If we renamed origin's main declaration to something like origin_main, then mustang could defined an extern "Rust" origin_main which calls an extern "C" main, which would allow it to continue to work. Renaming main here might be a good idea regardless, to avoid confusion. So, would you mind renaming main to origin_main or something else here? Then I'll look into doing the mustang side of the fix.

Also, if we're using extern "Rust", it feels less justified to rely on the trick of leaving out the envp argument when we don't need it, so could you also add envp parameters to the various example crates so that the signatures exactly match?

Ooh, and this may also be an opportunity to replace argc: i32, argv: *const *const u8 with args: &'static [u8] args: &'static [*const u8]. What would you think about that?

@morr0ne
Copy link
Contributor Author

morr0ne commented Sep 4, 2023

Maybe the way to fix this is though to make mustang handle this. If we renamed origin's main declaration to something like origin_main, then mustang could defined an extern "Rust" origin_main which calls an extern "C" main, which would allow it to continue to work. Renaming main here might be a good idea regardless, to avoid confusion. So, would you mind renaming main to origin_main or something else here? Then I'll look into doing the mustang side of the fix.

I am not sure that is the best solution. I feel like origin should strive to be independent of mustang, that is already my use case and I feel like the better solution is the other way around and let mustang handle the change. If you could point me to where mustang relies on this I could form a better opinion but I feel like mustang should instead wrap in an extern "C" the extern "Rust" if we want to mantain compatibility

Ooh, and this may also be an opportunity to replace argc: i32, argv: *const *const u8 with args: &'static [u8]. What would you think about that?

That feels like the next logical step, after all we are abandoning libc here and are in pure rust territory, we should be free to make a safer api that better reflects the rust way of doing things.

Coming back to the topic of whether or not to rename main to origin_main I actually came up with what I think is an even better and more idiomatic solution. Should origin even handle main as an extern? I feel like a better solution would be to actually make origin more composable and allow to insert any function pointer as main. Ideally this could allow for the main function to be named anything and ultimately have a proc_macro like #[origin::main] that can be used to decorate a function to mark it as main and generate some glue code. This would also eliminate the issue of having to keep track of how main is called on other crates like mustang

@morr0ne
Copy link
Contributor Author

morr0ne commented Sep 4, 2023

Following on my previous comment, ideally origin would not handle main in the library directly but instead users of the library would be able to write code like the following:

#[panic_handler]
fn panic(_panic: &core::panic::PanicInfo<'_>) -> ! {
    core::intrinsics::abort()
}

#[origin::main]
fn main(args: &'static [u8]) -> i32 {
    0
}

Which would roughly expand to something like this:

#[panic_handler]
fn panic(_panic: &core::panic::PanicInfo<'_>) -> ! {
    core::intrinsics::abort()
}

fn main(args: &'static [u8]) -> i32 {
    0
}

#[naked]
#[no_mangle]
unsafe extern "C" fn _start() -> ! {
    use core::arch::asm;

    fn entry(mem: *mut usize) -> ! {
        origin::_some_function_that_inits_the_runtime();
        let args = origin::_some_function_that_computes_args();
        let status = main(args);

        origin::exit(status)
    }

    #[cfg(target_arch = "x86_64")]
    asm!(
        "mov rdi, rsp", // Pass the incoming `rsp` as the arg to `entry`.
        "push rbp",     // Set the return address to zero.
        "jmp {entry}",  // Jump to `entry`.
        entry = sym entry,
        options(noreturn),
    );
}

This would allow normal programs to not need to look into origin internals and for more specif use cases like mustang, they can call the provided functions and manually handle the program initialization

@morr0ne
Copy link
Contributor Author

morr0ne commented Sep 4, 2023

Actually turns out we don't need to an extern "C" to jump to a function, so I update the code to reflect that. The only function that ever needs to use the C abi is _start since it's called directly by the os

@sunfishcode
Copy link
Owner

My instinct here is that handling main/origin_main as an extern is ok. This is a pretty low-level API that I expect users are not going to want to use directly very often. In particular, it gives you *const u8s rather than &strs or &OsStrs or &CStrs, and I don't think we want the default experience of sitting down to write a main function to involve raw pointers.

One option would be to go the other way, and make origin's main higher-level with arguments like (args: &'static [&'static CStr], envs: &'static [&'static CStr]), or maybe even &OsStr instead of &CStr, but I don't know how to do that without allocating, doing work, and/or pulling in dependencies that not all users will need.

Consequently, I expect the most common way to use this API will be via wrappers, whether that's mustang wrapping it up to produce a C-compatible main, origin-studio wrapping it up in its own macros, or other things. And as such, there isn't a strong motivation to bring in heavyweight features like proc macros to make this level more flexible.

@sunfishcode
Copy link
Owner

I can confirm that Mustang can do what it needs with a wrapper here. So that's fine.

@morr0ne
Copy link
Contributor Author

morr0ne commented Sep 5, 2023

My instinct here is that handling main/origin_main as an extern is ok. This is a pretty low-level API that I expect users are not going to want to use directly very often. In particular, it gives you *const u8s rather than &strs or &OsStrs or &CStrs, and I don't think we want the default experience of sitting down to write a main function to involve raw pointers.

One option would be to go the other way, and make origin's main higher-level with arguments like (args: &'static [&'static CStr], envs: &'static [&'static CStr]), or maybe even &OsStr instead of &CStr, but I don't know how to do that without allocating, doing work, and/or pulling in dependencies that not all users will need.

I think we might want to not pass args at all to the main function. I don't think there is any way to pass them in a "rusty" way without involving some complex logic and/or externa crates. After all isn't this what normal rust programs expect anyway? Imo the best way to handle args is the way std does by providing an iterator api, which in this case could be handled by origin-stdio instead of origin itself

Consequently, I expect the most common way to use this API will be via wrappers, whether that's mustang wrapping it up to produce a C-compatible main, origin-studio wrapping it up in its own macros, or other things. And as such, there isn't a strong motivation to bring in heavyweight features like proc macros to make this level more flexible.

If we truly wanna make origin more flexible then it shouldn't handle main at all, instead it should just export the appropriate function to let other crates like mustang and origin-stdio handle the initialization. I still think there is value in allowing for origin-only programs and a proc_macro would fulfill that role, it would live in a separate crate called origin-macros which would be an optional dependency disabled by default.

By the way, sorry if this comment is somewhat strange but github is experiencing some kind of outage and I've spent the better part of an hour trying to write this comment.

@sunfishcode
Copy link
Owner

The patch here is now merged as part of #39, and the discussion is continuing in #36 which remains open, so I'll close this PR now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants