Skip to content

std.fmt.format — with generators #4059

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 39 commits into from
Closed

Conversation

fengb
Copy link
Contributor

@fengb fengb commented Jan 3, 2020

Experiment to convert std.fmt.format into an iterator with userland generators. This pattern is a little rougher than I'd like, but it does work (as long as you don't hit async bugs). The biggest benefit here is the ability to seamlessly context dumping providing a nice iterator interface over the old errorset callback style.

var generator = Generator([]const u8).init();  // create the generator context
_ = async genFunction(&generator, args);  // prime the generator with async frame
while (generator.next()) |bytes| {
    consume(bytes);
}

Ideally, I would like to be able to return generators, but embedding both the async frame and the yield location seems to be impossible at the moment (requires @resultLocation()), and even then requiring generators be comptime generic removes a lot of composability.

TODO list:

Discovered a way to remove the manual recursion stack, thus these following problems no longer exist!

  • precalculate the recursive frame size, possibly leave extra buffer in case of complex — e.g. when customer formatters call other custom formatters ad infinitum
  • implement FixedBufferAllocator.free — need to enable freeing the last allocation for this recursion to work properly
  • better default output when giving up recursion — "??OOM??" looks silly

@fengb fengb force-pushed the format-generator branch from 1560e8a to 7e50a2e Compare January 3, 2020 16:14
@fengb
Copy link
Contributor Author

fengb commented Jan 6, 2020

I stumbled upon an elegant solution to the recursion problem. By applying comptime max_depth, I'm able to convince Zig to analyze all possible depths (or infinite loop trying...) and let me use "real" recursion in async functions. This is incredibly elegant since it should automatically account for the correct frame size, instead of me trying to figure out depth and manually shifting things onto a separate stack.

My only concern is that this can easily bloat the generated code, since every recursion step would potentially be its own function (now that max_depth is comptime).

@fengb fengb force-pushed the format-generator branch from 13c73e9 to bc47a89 Compare January 6, 2020 06:25
@fengb fengb force-pushed the format-generator branch from bc47a89 to 42a47eb Compare January 6, 2020 14:56
@daurnimator daurnimator added the standard library This issue involves writing Zig code for the standard library. label Jan 6, 2020
@fengb fengb force-pushed the format-generator branch from b07d6bb to 24e43df Compare January 7, 2020 16:31
@fengb
Copy link
Contributor Author

fengb commented Jan 14, 2020

The new atoi equivalent is ~16x more assembly instructions than existing code.

Perf tuning experiments over in https://github.com/fengb/generator-bench

Edit: https://godbolt.org/z/_Zve-- seems to indicate ~35 instructions to setup the generator and ~8 instructions per suspend/resume. All of the extra instructions generated by a simple suspend are optimized out with --single-threaded, but sadly LLVM still doesn't understand what to do with generator.yield.

Edit x5: Rust unstable generators have a very similar optimization problem: https://godbolt.org/z/pzsaCZ

@fengb
Copy link
Contributor Author

fengb commented Jan 15, 2020

Ran into a footgun. The generator currently checks after a resume to see whether the yielded value changes, and if not it's considered "done" (i.e. the function returned so that's why there is no yielded value). The only other way is if there's a suspend without a corresponding generator.yield, which I had assumed is a user error.

Unfortunately, there's a separate case that I hadn't considered: an actual async function. This is re-exposing the ugly color problem, and to solve it there needs to be some advanced heuristics to differentiate between the generator suspend and the async execution suspend.

fn iter(generator: *Generator([]const u8), intermission: anyframe->void) void {
    suspend generator.yield(@frame(), "Hello");

    // this is a non-yielding suspend
    _ = await intermission;

    suspend generator.yield(@frame(), "World");
}

If we continue with the current behavior, any await/suspend would also terminate the generator. This is kludgy and gross and exhibits spooky action at a distance (e.g. some function somewhere calls os.read() and the generator stops running), but it seems to be at least consistently reproducible.

If we want to allow async generators, I have a few ideas:

  1. next() will probably have to be always async, which might re-expose the seam in unwarranted places.
  2. Detecting frame liveness is most reliable, but it requires compiler support.
  3. Require a manual generator.terminate(). This would break the current style of sharing the generator context across the stack since somebody will have to own the exit.
  4. Manually refcount the function entrances/exits. This might be the least bad since every generator supported function can simply start with
    fn foo(generator: *Generator(T)) void {
        generator.acquire();
        defer generator.release();

@fengb
Copy link
Contributor Author

fengb commented Jan 31, 2020

Regression in this implementation — format() will no longer be usable at comptime:

const std = @import("std");

comptime {
    var buf: [100]u8 = undefined;
    @compileError(std.fmtgen.bufPrint(&buf, "hello {}", .{12}) catch unreachable);
}

We'll need either comptime async or real generators, which sounds like a lot more work.

@fengb
Copy link
Contributor Author

fengb commented Feb 10, 2020

Encountered too many design flaws. And this usecase is now obsoleted by #4404

@fengb fengb closed this Feb 10, 2020
@andrewrk
Copy link
Member

🌹 R.I.P. 🌹

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants