@@ -7928,13 +7928,261 @@ pub fn main() void {
7928
7928
7929
7929
{#header_close#}
7930
7930
{#header_open|Memory#}
7931
- <p>TODO: explain no default allocator in zig</p>
7932
- <p>TODO: show how to use the allocator interface</p>
7933
- <p>TODO: mention debug allocator</p>
7934
- <p>TODO: importance of checking for allocation failure</p>
7935
- <p>TODO: mention overcommit and the OOM Killer</p>
7936
- <p>TODO: mention recursion</p>
7937
- {#see_also|Pointers#}
7931
+ <p>
7932
+ The Zig language performs no memory management on behalf of the programmer. This is
7933
+ why Zig has no runtime, and why Zig code works seamlessly in so many environments,
7934
+ including real-time software, operating system kernels, embedded devices, and
7935
+ low latency servers. As a consequence, Zig programmers must always be able to answer
7936
+ the question:
7937
+ </p>
7938
+ <p>{#link|Where are the bytes?#}</p>
7939
+ <p>
7940
+ Like Zig, the C programming language has manual memory management. However, unlike Zig,
7941
+ C has a default allocator - <code>malloc</code>, <code>realloc</code>, and <code>free</code>.
7942
+ When linking against libc, Zig exposes this allocator with {#syntax#}std.heap.c_allocator{#endsyntax#}.
7943
+ However, by convention, there is no default allocator in Zig. Instead, functions which need to
7944
+ allocate accept an {#syntax#}*Allocator{#endsyntax#} parameter. Likewise, data structures such as
7945
+ {#syntax#}std.ArrayList{#endsyntax#} accept an {#syntax#}*Allocator{#endsyntax#} parameter in
7946
+ their initialization functions:
7947
+ </p>
7948
+ {#code_begin|test|allocator#}
7949
+ const std = @import("std");
7950
+ const Allocator = std.mem.Allocator;
7951
+ const assert = std.debug.assert;
7952
+
7953
+ test "using an allocator" {
7954
+     var buffer: [100]u8 = undefined;
7955
+     const allocator = &std.heap.FixedBufferAllocator.init(&buffer).allocator;
7956
+     const result = try concat(allocator, "foo", "bar");
7957
+     assert(std.mem.eql(u8, "foobar", result));
7958
+ }
7959
+
7960
+ fn concat(allocator: *Allocator, a: []const u8, b: []const u8) ![]u8 {
7961
+     const result = try allocator.alloc(u8, a.len + b.len);
7962
+     std.mem.copy(u8, result, a);
7963
+     std.mem.copy(u8, result[a.len..], b);
7964
+     return result;
7965
+ }
7966
+ {#code_end#}
7967
+ <p>
7968
+ In the above example, 100 bytes of stack memory are used to initialize a
7969
+ {#syntax#}FixedBufferAllocator{#endsyntax#}, which is then passed to a function.
7970
+ As a convenience, there is a global {#syntax#}FixedBufferAllocator{#endsyntax#}
7971
+ available for quick tests at {#syntax#}std.debug.global_allocator{#endsyntax#}.
7972
+ However, it is deprecated and should be avoided in favor of directly using a
7973
+ {#syntax#}FixedBufferAllocator{#endsyntax#} as in the example above.
7974
+ </p>
7975
+ <p>
7976
+ Currently Zig has no general purpose allocator, but there is
7977
+ <a href="https://github.com/andrewrk/zig-general-purpose-allocator/">one under active development</a>.
7978
+ Once it is merged into the Zig standard library it will become available to import
7979
+ with {#syntax#}std.heap.default_allocator{#endsyntax#}. However, it will still be recommended to
7980
+ follow the {#link|Choosing an Allocator#} guide.
7981
+ </p>
7982
+
7983
+ {#header_open|Choosing an Allocator#}
7984
+ <p>Which allocator to use depends on a number of factors. Here is a flow chart to help you decide:
7985
+ </p>
7986
+ <ol>
7987
+ <li>
7988
+ Are you making a library? In this case, it is best to accept an {#syntax#}*Allocator{#endsyntax#}
7989
+ as a parameter and allow your library's users to decide what allocator to use.
7990
+ </li>
7991
+ <li>Are you linking libc? In this case, {#syntax#}std.heap.c_allocator{#endsyntax#} is likely
7992
+ the right choice, at least for your main allocator.</li>
7993
+ <li>
7994
+ Is the maximum number of bytes that you will need bounded by a number known at
7995
+ {#link|comptime#}? In this case, use {#syntax#}std.heap.FixedBufferAllocator{#endsyntax#} or
7996
+ {#syntax#}std.heap.ThreadSafeFixedBufferAllocator{#endsyntax#} depending on whether you need
7997
+ thread safety.
7998
+ </li>
7999
+ <li>
8000
+ Is your program a command line application which runs from start to end without any fundamental
8001
+ cyclical pattern (such as a video game main loop, or a web server request handler),
8002
+ such that it would make sense to free everything at once at the end?
8003
+ In this case, it is recommended to follow this pattern:
8004
+ {#code_begin|exe|cli_allocation#}
8005
+ const std = @import("std");
8006
+
8007
+ pub fn main() !void {
8008
+     var direct_allocator = std.heap.DirectAllocator.init();
8009
+     defer direct_allocator.deinit();
8010
+
8011
+     var arena = std.heap.ArenaAllocator.init(&direct_allocator.allocator);
8012
+     defer arena.deinit();
8013
+
8014
+     const allocator = &arena.allocator;
8015
+
8016
+     const ptr = try allocator.create(i32);
8017
+     std.debug.warn("ptr={*}\n", ptr);
8018
+ }
8019
+ {#code_end#}
8020
+ When using this kind of allocator, there is no need to free anything manually. Everything
8021
+ gets freed at once with the call to {#syntax#}arena.deinit(){#endsyntax#}.
8022
+ </li>
8023
+ <li>
8024
+ Are the allocations part of a cyclical pattern such as a video game main loop, or a web
8025
+ server request handler? If the allocations can all be freed at once, at the end of the cycle,
8026
+ for example once the video game frame has been fully rendered, or the web server request has
8027
+ been served, then {#syntax#}std.heap.ArenaAllocator{#endsyntax#} is a great candidate. As
8028
+ demonstrated in the previous bullet point, this allows you to free entire arenas at once.
8029
+ Note also that if an upper bound of memory can be established, then
8030
+ {#syntax#}std.heap.FixedBufferAllocator{#endsyntax#} can be used as a further optimization.
8031
+ </li>
8032
+ <li>
8033
+ Are you writing a test, and you want to make sure {#syntax#}error.OutOfMemory{#endsyntax#}
8034
+ is handled correctly? In this case, use {#syntax#}std.debug.FailingAllocator{#endsyntax#}.
8035
+ </li>
8036
+ <li>
8037
+ Finally, if none of the above apply, you need a general purpose allocator. Zig does not
8038
+ yet have a general purpose allocator in the standard library,
8039
+ <a href="https://github.com/andrewrk/zig-general-purpose-allocator/">but one is being actively developed</a>.
8040
+ You can also consider {#link|Implementing an Allocator#}.
8041
+ </li>
8042
+ </ol>
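+ <p>
+ As a rough sketch of the cyclical case, the following test creates one arena per loop
+ iteration and frees it when the iteration ends. {#syntax#}handleCycle{#endsyntax#} is a
+ hypothetical stand-in for per-frame or per-request work, and the example assumes the same
+ standard library API as the command line example in the list above.
+ </p>
+ {#code_begin|test|arena_per_cycle#}
+ const std = @import("std");
+ const Allocator = std.mem.Allocator;
+ const assert = std.debug.assert;
+
+ // Hypothetical per-cycle work: allocates scratch memory and never frees it itself.
+ fn handleCycle(allocator: *Allocator, cycle: usize) !usize {
+     const scratch = try allocator.alloc(u8, cycle + 1);
+     std.mem.set(u8, scratch, 0);
+     return scratch.len;
+ }
+
+ test "one arena per cycle" {
+     var direct_allocator = std.heap.DirectAllocator.init();
+     defer direct_allocator.deinit();
+
+     var total: usize = 0;
+     var cycle: usize = 0;
+     while (cycle < 3) : (cycle += 1) {
+         var arena = std.heap.ArenaAllocator.init(&direct_allocator.allocator);
+         // Runs at the end of each loop iteration, freeing that cycle's memory at once.
+         defer arena.deinit();
+
+         total += try handleCycle(&arena.allocator, cycle);
+     }
+     assert(total == 6);
+ }
+ {#code_end#}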
8043
+ {#header_close#}
8044
+
8045
+ {#header_open|Where are the bytes?#}
8046
+ <p>String literals such as {#syntax#}"foo"{#endsyntax#} are in the global constant data section.
8047
+ This is why it is an error to pass a string literal where a mutable slice is expected, like this:
8048
+ </p>
8049
+ {#code_begin|test_err|expected type '[]u8'#}
8050
+ fn foo(s: []u8) void {}
8051
+
8052
+ test "string literal to mutable slice" {
8053
+     foo("hello");
8054
+ }
8055
+ {#code_end#}
8056
+ <p>However, if you make the slice constant, then it works:</p>
8057
+ {#code_begin|test|strlit#}
8058
+ fn foo(s: []const u8) void {}
8059
+
8060
+ test "string literal to constant slice" {
8061
+     foo("hello");
8062
+ }
8063
+ {#code_end#}
8064
+ <p>
8065
+ Just like string literals, {#syntax#}const{#endsyntax#} declarations, when the value is known at {#link|comptime#},
8066
+ are stored in the global constant data section. {#link|Compile Time Variables#} are also stored
8067
+ in the global constant data section.
8068
+ </p>
8069
+ <p>
8070
+ {#syntax#}var{#endsyntax#} declarations inside functions are stored in the function's stack frame. Once a function returns,
8071
+ any {#link|Pointers#} to variables in the function's stack frame become invalid references, and
8072
+ dereferencing them becomes unchecked {#link|Undefined Behavior#}.
8073
+ </p>
8074
+ <p>
8075
+ {#syntax#}var{#endsyntax#} declarations at the top level or in {#link|struct#} declarations are stored in the global
8076
+ data section.
8077
+ </p>
8078
+ <p>
8079
+ The location of memory allocated with {#syntax#}allocator.alloc{#endsyntax#} or
8080
+ {#syntax#}allocator.create{#endsyntax#} is determined by the allocator's implementation.
8081
+ </p>
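+ <p>
+ The rules above can be illustrated with a small sketch. The names
+ {#syntax#}greeting{#endsyntax#}, {#syntax#}counter{#endsyntax#}, and
+ {#syntax#}increment{#endsyntax#} are made up for this example, and the comments restate
+ the storage locations described in this section rather than anything the compiler reports.
+ </p>
+ {#code_begin|test|where_bytes#}
+ const std = @import("std");
+ const assert = std.debug.assert;
+
+ // Value known at comptime: stored in the global constant data section.
+ const greeting = "hello";
+
+ // Top level var: stored in the global data section.
+ var counter: u32 = 0;
+
+ fn increment() u32 {
+     // Stored in this function's stack frame; invalid once the function returns.
+     var step: u32 = 1;
+     counter += step;
+     return counter;
+ }
+
+ test "where are the bytes" {
+     assert(greeting.len == 5);
+     assert(increment() == 1);
+     assert(increment() == 2);
+
+     // The allocator decides where this memory lives. A FixedBufferAllocator
+     // hands out pieces of `buffer`, which sits in this test's stack frame.
+     var buffer: [8]u8 = undefined;
+     var fixed = std.heap.FixedBufferAllocator.init(&buffer);
+     const ptr = try fixed.allocator.create(u32);
+     ptr.* = counter;
+     assert(ptr.* == 2);
+ }
+ {#code_end#}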
8082
+ <p>TODO: thread local variables</p>
8083
+ {#header_close#}
8084
+
8085
+ {#header_open|Implementing an Allocator#}
8086
+ <p>Zig programmers can implement their own allocators by fulfilling the Allocator interface.
8087
+ To do this, one must carefully read the documentation comments in std/mem.zig and
8088
+ then supply a {#syntax#}reallocFn{#endsyntax#} and a {#syntax#}shrinkFn{#endsyntax#}.
8089
+ </p>
8090
+ <p>
8091
+ There are many example allocators to look at for inspiration. Look at std/heap.zig and
8092
+ at this
8093
+ <a href="https://github.com/andrewrk/zig-general-purpose-allocator/">work-in-progress general purpose allocator</a>.
8094
+ TODO: once <a href="https://github.com/ziglang/zig/issues/21">#21</a> is done, link to the docs
8095
+ here.
8096
+ </p>
8097
+ {#header_close#}
8098
+
8099
+ {#header_open|Heap Allocation Failure#}
8100
+ <p>
8101
+ Many programming languages choose to handle the possibility of heap allocation failure by
8102
+ unconditionally crashing. By convention, Zig programmers do not consider this to be a
8103
+ satisfactory solution. Instead, {#syntax#}error.OutOfMemory{#endsyntax#} represents
8104
+ heap allocation failure, and Zig libraries return this error code whenever heap allocation
8105
+ failure prevented an operation from completing successfully.
8106
+ </p>
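+ <p>
+ As a minimal sketch of this convention, the hypothetical function
+ {#syntax#}duplicate{#endsyntax#} below propagates allocation failure with
+ {#syntax#}try{#endsyntax#}, and the test forces a failure by supplying a buffer that is
+ too small. It assumes the same {#syntax#}FixedBufferAllocator{#endsyntax#} API used
+ earlier in this section.
+ </p>
+ {#code_begin|test|out_of_memory#}
+ const std = @import("std");
+ const Allocator = std.mem.Allocator;
+ const assert = std.debug.assert;
+
+ /// Caller owns the returned memory.
+ fn duplicate(allocator: *Allocator, bytes: []const u8) ![]u8 {
+     // On failure, error.OutOfMemory is returned to the caller instead of crashing.
+     const result = try allocator.alloc(u8, bytes.len);
+     std.mem.copy(u8, result, bytes);
+     return result;
+ }
+
+ test "allocation failure is reported as an error" {
+     // A 4 byte buffer cannot satisfy a 5 byte request.
+     var buffer: [4]u8 = undefined;
+     var fixed = std.heap.FixedBufferAllocator.init(&buffer);
+
+     if (duplicate(&fixed.allocator, "hello")) |_| {
+         unreachable;
+     } else |err| {
+         assert(err == error.OutOfMemory);
+     }
+ }
+ {#code_end#}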
8107
+ <p>
8108
+ Some have argued that because some operating systems such as Linux have memory overcommit enabled by
8109
+ default, it is pointless to handle heap allocation failure. There are many problems with this reasoning:
8110
+ </p>
8111
+ <ul>
8112
+ <li>Only some operating systems have an overcommit feature.
8113
+ <ul>
8114
+ <li>Linux has it enabled by default, but it is configurable.</li>
8115
+ <li>Windows does not overcommit.</li>
8116
+ <li>Embedded systems do not have overcommit.</li>
8117
+ <li>Hobby operating systems may or may not have overcommit.</li>
8118
+ </ul>
8119
+ </li>
8120
+ <li>
8121
+ For real-time systems, not only is there no overcommit, but typically the maximum amount
8122
+ of memory per application is determined ahead of time.
8123
+ </li>
8124
+ <li>
8125
+ When writing a library, one of the main goals is code reuse. By making code handle
8126
+ allocation failure correctly, a library becomes eligible to be reused in
8127
+ more contexts.
8128
+ </li>
8129
+ <li>
8130
+ Although some software has grown to depend on overcommit being enabled, its existence
8131
+ is the source of countless user experience disasters. When a system with overcommit enabled,
8132
+ such as Linux on default settings, comes close to memory exhaustion, the system locks up
8133
+ and becomes unusable. At this point, the OOM Killer selects an application to kill
8134
+ based on heuristics. This non-deterministic decision often results in an important process
8135
+ being killed, and often fails to return the system back to working order.
8136
+ </li>
8137
+ </ul>
8138
+ {#header_close#}
8139
+
8140
+ {#header_open|Recursion#}
8141
+ <p>
8142
+ Recursion is a fundamental tool in modeling software. However, it has an often-overlooked problem:
8143
+ unbounded memory allocation.
8144
+ </p>
8145
+ <p>
8146
+ Recursion is an area of active experimentation in Zig and so the documentation here is not final.
8147
+ You can read a
8148
+ <a href="https://ziglang.org/download/0.3.0/release-notes.html#recursion">summary of recursion status in the 0.3.0 release notes</a>.
8149
+ </p>
8150
+ <p>
8151
+ In short, recursion currently works as you would expect. Although Zig code
8152
+ is not yet protected from stack overflow, it is planned that a future version of Zig will provide
8153
+ such protection, with some degree of cooperation from Zig code required.
8154
+ </p>
8155
+ {#header_close#}
8156
+
8157
+ {#header_open|Lifetime and Ownership#}
8158
+ <p>
8159
+ It is the Zig programmer's responsibility to ensure that a {#link|pointer|Pointers#} is not
8160
+ accessed when the memory pointed to is no longer available. Note that a {#link|slice|Slices#}
8161
+ is a form of pointer, in that it references other memory.
8162
+ </p>
8163
+ <p>
8164
+ In order to prevent bugs, there are some helpful conventions to follow when dealing with pointers.
8165
+ In general, when a function returns a pointer, the documentation for the function should explain
8166
+ who "owns" the pointer. This concept helps the programmer decide when it is appropriate, if ever,
8167
+ to free the pointer.
8168
+ </p>
8169
+ <p>
8170
+ For example, the function's documentation may say "caller owns the returned memory", in which case
8171
+ the code that calls the function must have a plan for when to free that memory. In this situation,
8172
+ the function will probably accept an {#syntax#}*Allocator{#endsyntax#} parameter.
8173
+ </p>
8174
+ <p>
8175
+ Sometimes the lifetime of a pointer may be more complicated. For example, when using
8176
+ {#syntax#}std.ArrayList(T).toSlice(){#endsyntax#}, the returned slice remains
8177
+ valid until the next time the list is resized, such as by appending new elements.
8178
+ </p>
8179
+ <p>
8180
+ The API documentation for functions and data structures should take great care to explain
8181
+ the ownership and lifetime semantics of pointers. Ownership determines whose responsibility it
8182
+ is to free the memory referenced by the pointer, and lifetime determines the point at which
8183
+ the memory becomes inaccessible (lest {#link|Undefined Behavior#} occur).
8184
+ </p>
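+ <p>
+ The "caller owns the returned memory" convention might look like the following sketch.
+ The function {#syntax#}repeat{#endsyntax#} is hypothetical; the point is that its doc
+ comment states the ownership rule and the caller pairs the call with a
+ {#syntax#}defer{#endsyntax#} that frees the memory using the same allocator.
+ </p>
+ {#code_begin|test|caller_owns#}
+ const std = @import("std");
+ const Allocator = std.mem.Allocator;
+ const assert = std.debug.assert;
+
+ /// Returns `count` copies of `byte`.
+ /// Caller owns the returned memory and must free it with `allocator`.
+ fn repeat(allocator: *Allocator, byte: u8, count: usize) ![]u8 {
+     const result = try allocator.alloc(u8, count);
+     std.mem.set(u8, result, byte);
+     return result;
+ }
+
+ test "caller frees the returned memory" {
+     var buffer: [16]u8 = undefined;
+     var fixed = std.heap.FixedBufferAllocator.init(&buffer);
+     const allocator = &fixed.allocator;
+
+     const bytes = try repeat(allocator, 'z', 3);
+     // The slice must not be used after this free runs at the end of the scope.
+     defer allocator.free(bytes);
+
+     assert(std.mem.eql(u8, bytes, "zzz"));
+ }
+ {#code_end#}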
8185
+ {#header_close#}
7938
8186
7939
8187
{#header_close#}
7940
8188
{#header_open|Compile Variables#}