Commit 567175f

committed
add documentation for Memory
closes #1904
1 parent 1b801bd commit 567175f

1 file changed

+255
-7
lines changed

doc/langref.html.in

Lines changed: 255 additions & 7 deletions
@@ -7928,13 +7928,261 @@ pub fn main() void {
79287928

79297929
{#header_close#}
79307930
{#header_open|Memory#}
7931-
<p>TODO: explain no default allocator in zig</p>
7932-
<p>TODO: show how to use the allocator interface</p>
7933-
<p>TODO: mention debug allocator</p>
7934-
<p>TODO: importance of checking for allocation failure</p>
7935-
<p>TODO: mention overcommit and the OOM Killer</p>
7936-
<p>TODO: mention recursion</p>
7937-
{#see_also|Pointers#}
7931+
<p>
7932+
The Zig language performs no memory management on behalf of the programmer. This is
7933+
why Zig has no runtime, and why Zig code works seamlessly in so many environments,
7934+
including real-time software, operating system kernels, embedded devices, and
7935+
low latency servers. As a consequence, Zig programmers must always be able to answer
7936+
the question:
7937+
</p>
7938+
<p>{#link|Where are the bytes?#}</p>
7939+
<p>
7940+
Like Zig, the C programming language has manual memory management. However, unlike Zig,
7941+
C has a default allocator: <code>malloc</code>, <code>realloc</code>, and <code>free</code>.
7942+
When linking against libc, Zig exposes this allocator with {#syntax#}std.heap.c_allocator{#endsyntax#}.
7943+
However, by convention, there is no default allocator in Zig. Instead, functions which need to
7944+
allocate accept an {#syntax#}*Allocator{#endsyntax#} parameter. Likewise, data structures such as
7945+
{#syntax#}std.ArrayList{#endsyntax#} accept an {#syntax#}*Allocator{#endsyntax#} parameter in
7946+
their initialization functions:
7947+
</p>
7948+
{#code_begin|test|allocator#}
7949+
const std = @import("std");
7950+
const Allocator = std.mem.Allocator;
7951+
const assert = std.debug.assert;
7952+
7953+
test "using an allocator" {
7954+
    var buffer: [100]u8 = undefined;
7955+
    const allocator = &std.heap.FixedBufferAllocator.init(&buffer).allocator;
7956+
    const result = try concat(allocator, "foo", "bar");
7957+
    assert(std.mem.eql(u8, "foobar", result));
7958+
}
7959+
7960+
fn concat(allocator: *Allocator, a: []const u8, b: []const u8) ![]u8 {
7961+
    const result = try allocator.alloc(u8, a.len + b.len);
7962+
    std.mem.copy(u8, result, a);
7963+
    std.mem.copy(u8, result[a.len..], b);
7964+
    return result;
7965+
}
7966+
{#code_end#}
7967+
<p>
7968+
In the above example, 100 bytes of stack memory are used to initialize a
7969+
{#syntax#}FixedBufferAllocator{#endsyntax#}, which is then passed to a function.
7970+
As a convenience there is a global {#syntax#}FixedBufferAllocator{#endsyntax#}
7971+
available for quick tests at {#syntax#}std.debug.global_allocator{#endsyntax#}.
7972+
However, it is deprecated and should be avoided in favor of directly using a
7973+
{#syntax#}FixedBufferAllocator{#endsyntax#} as in the example above.
7974+
</p>
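<p>
Data structures follow the same convention. The following is a minimal sketch, assuming the same
standard library API as the example above, of a {#syntax#}std.ArrayList{#endsyntax#} that receives its
allocator at initialization. The buffer size and the stored values are arbitrary choices made for
illustration.
</p>
{#code_begin|test|arraylist_allocator#}
const std = @import("std");
const assert = std.debug.assert;

test "ArrayList takes its allocator at initialization" {
    // 256 bytes is an arbitrary size, comfortably larger than the list needs here.
    var buffer: [256]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(buffer[0..]);

    // The list stores the allocator and uses it for every later resize.
    var list = std.ArrayList(i32).init(&fba.allocator);
    defer list.deinit();

    try list.append(1);
    try list.append(2);
    try list.append(3);

    assert(list.len == 3);
    assert(list.toSlice()[2] == 3);
}
{#code_end#}
<p>
Because the list only sees an {#syntax#}*Allocator{#endsyntax#}, the same code works unchanged with any
other allocator.
</p>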
7975+
<p>
7976+
Currently Zig has no general purpose allocator, but there is
7977+
<a href="https://github.com/andrewrk/zig-general-purpose-allocator/">one under active development</a>.
7978+
Once it is merged into the Zig standard library it will become available to import
7979+
with {#syntax#}std.heap.default_allocator{#endsyntax#}. However, it will still be recommended to
7980+
follow the {#link|Choosing an Allocator#} guide.
7981+
</p>
7982+
7983+
{#header_open|Choosing an Allocator#}
7984+
<p>Which allocator to use depends on a number of factors. Work through the following questions in order to help you decide:
7985+
</p>
7986+
<ol>
7987+
<li>
7988+
Are you making a library? In this case, it is best to accept an {#syntax#}*Allocator{#endsyntax#}
7989+
as a parameter and allow your library's users to decide what allocator to use.
7990+
</li>
7991+
<li>Are you linking libc? In this case, {#syntax#}std.heap.c_allocator{#endsyntax#} is likely
7992+
the right choice, at least for your main allocator.</li>
7993+
<li>
7994+
Is the maximum number of bytes that you will need bounded by a number known at
7995+
{#link|comptime#}? In this case, use {#syntax#}std.heap.FixedBufferAllocator{#endsyntax#} or
7996+
{#syntax#}std.heap.ThreadSafeFixedBufferAllocator{#endsyntax#} depending on whether you need
7997+
thread-safety or not.
7998+
</li>
7999+
<li>
8000+
Is your program a command line application which runs from start to end without any fundamental
8001+
cyclical pattern (such as a video game main loop, or a web server request handler),
8002+
such that it would make sense to free everything at once at the end?
8003+
In this case, it is recommended to follow this pattern:
8004+
{#code_begin|exe|cli_allocation#}
8005+
const std = @import("std");
8006+
8007+
pub fn main() !void {
8008+
    var direct_allocator = std.heap.DirectAllocator.init();
8009+
    defer direct_allocator.deinit();
8010+
8011+
    var arena = std.heap.ArenaAllocator.init(&direct_allocator.allocator);
8012+
    defer arena.deinit();
8013+
8014+
    const allocator = &arena.allocator;
8015+
8016+
    const ptr = try allocator.create(i32);
8017+
    std.debug.warn("ptr={*}\n", ptr);
8018+
}
8019+
{#code_end#}
8020+
When using this kind of allocator, there is no need to free anything manually. Everything
8021+
gets freed at once with the call to {#syntax#}arena.deinit(){#endsyntax#}.
8022+
</li>
8023+
<li>
8024+
Are the allocations part of a cyclical pattern such as a video game main loop, or a web
8025+
server request handler? If the allocations can all be freed at once, at the end of the cycle,
8026+
for example once the video game frame has been fully rendered, or the web server request has
8027+
been served, then {#syntax#}std.heap.ArenaAllocator{#endsyntax#} is a great candidate. As
8028+
demonstrated in the previous bullet point, this allows you to free entire arenas at once.
8029+
Note also that if an upper bound of memory can be established, then
8030+
{#syntax#}std.heap.FixedBufferAllocator{#endsyntax#} can be used as a further optimization.
8031+
</li>
8032+
<li>
8033+
Are you writing a test, and you want to make sure {#syntax#}error.OutOfMemory{#endsyntax#}
8034+
is handled correctly? In this case, use {#syntax#}std.debug.FailingAllocator{#endsyntax#}.
8035+
</li>
8036+
<li>
8037+
Finally, if none of the above apply, you need a general purpose allocator. Zig does not
8038+
yet have a general purpose allocator in the standard library,
8039+
<a href="https://github.com/andrewrk/zig-general-purpose-allocator/">but one is being actively developed</a>.
8040+
You can also consider {#link|Implementing an Allocator#}.
8041+
</li>
8042+
</ol>
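<p>
The cyclical case described in the list can be sketched as follows. This is an illustration only:
{#syntax#}handleFrame{#endsyntax#} and the frame count are hypothetical, and the allocator API is assumed
to match the command line example above. Each cycle gets a fresh arena, and everything allocated during
the cycle is released by a single {#syntax#}arena.deinit(){#endsyntax#}.
</p>
{#code_begin|test|arena_per_cycle#}
const std = @import("std");
const Allocator = std.mem.Allocator;
const assert = std.debug.assert;

// Hypothetical per-cycle work. Nothing allocated here is freed individually;
// the caller's arena releases it all at the end of the cycle.
fn handleFrame(allocator: *Allocator, frame: usize) !usize {
    const scratch = try allocator.alloc(u8, 16 + frame);
    std.mem.set(u8, scratch, 0);
    return scratch.len;
}

test "one arena per cycle" {
    var direct_allocator = std.heap.DirectAllocator.init();
    defer direct_allocator.deinit();

    var frame: usize = 0;
    while (frame < 3) : (frame += 1) {
        var arena = std.heap.ArenaAllocator.init(&direct_allocator.allocator);
        // Runs at the end of every loop iteration, freeing the whole cycle at once.
        defer arena.deinit();

        const used = try handleFrame(&arena.allocator, frame);
        assert(used == 16 + frame);
    }
}
{#code_end#}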
8043+
{#header_close#}
8044+
8045+
{#header_open|Where are the bytes?#}
8046+
<p>String literals such as {#syntax#}"foo"{#endsyntax#} are in the global constant data section.
8047+
This is why it is an error to pass a string literal where a mutable slice is expected, like this:
8048+
</p>
8049+
{#code_begin|test_err|expected type '[]u8'#}
8050+
fn foo(s: []u8) void {}
8051+
8052+
test "string literal to mutable slice" {
8053+
    foo("hello");
8054+
}
8055+
{#code_end#}
8056+
<p>However, if you make the slice constant, then it works:</p>
8057+
{#code_begin|test|strlit#}
8058+
fn foo(s: []const u8) void {}
8059+
8060+
test "string literal to constant slice" {
8061+
    foo("hello");
8062+
}
8063+
{#code_end#}
8064+
<p>
8065+
Just like string literals, {#syntax#}const{#endsyntax#} declarations, when the value is known at {#link|comptime#},
8066+
are stored in the global constant data section. {#link|Compile Time Variables#} are also stored
8067+
in the global constant data section.
8068+
</p>
8069+
<p>
8070+
{#syntax#}var{#endsyntax#} declarations inside functions are stored in the function's stack frame. Once a function returns,
8071+
any {#link|Pointers#} to variables in the function's stack frame become invalid references, and
8072+
dereferencing them becomes unchecked {#link|Undefined Behavior#}.
8073+
</p>
8074+
<p>
8075+
{#syntax#}var{#endsyntax#} declarations at the top level or in {#link|struct#} declarations are stored in the global
8076+
data section.
8077+
</p>
8078+
<p>
8079+
The location of memory allocated with {#syntax#}allocator.alloc{#endsyntax#} or
8080+
{#syntax#}allocator.create{#endsyntax#} is determined by the allocator's implementation.
8081+
</p>
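<p>
As a small illustration of the rules above, the sketch below returns a pointer to a top level
{#syntax#}var{#endsyntax#}. The names {#syntax#}counter{#endsyntax#}, {#syntax#}greeting{#endsyntax#}, and
{#syntax#}nextId{#endsyntax#} are hypothetical. The pointer stays valid after the function returns because
the variable lives in the global data section rather than in the function's stack frame.
</p>
{#code_begin|test|where_are_the_bytes#}
const std = @import("std");
const assert = std.debug.assert;

// Top level `var`: stored in the global data section.
var counter: i32 = 0;

// `const` with a comptime-known value: stored in the global constant data section.
const greeting = "hello";

fn nextId() *i32 {
    counter += 1;
    // Valid to return: `counter` outlives every call to this function.
    // Returning the address of a local `var` instead would leave the caller
    // with a pointer into a stack frame that no longer exists.
    return &counter;
}

test "pointer to a global outlives the function that returned it" {
    const p = nextId();
    assert(p.* == 1);
    _ = nextId();
    // `p` still refers to the same global, which has been incremented again.
    assert(p.* == 2);
    assert(greeting.len == 5);
}
{#code_end#}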
8082+
<p>TODO: thread local variables</p>
8083+
{#header_close#}
8084+
8085+
{#header_open|Implementing an Allocator#}
8086+
<p>Zig programmers can implement their own allocators by fulfilling the Allocator interface.
8087+
In order to do this, one must carefully read the documentation comments in std/mem.zig and
8088+
then supply a {#syntax#}reallocFn{#endsyntax#} and a {#syntax#}shrinkFn{#endsyntax#}.
8089+
</p>
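<p>
The following is a rough sketch of what fulfilling the interface might look like. It wraps another
allocator and counts calls. {#syntax#}CountingAllocator{#endsyntax#} is a hypothetical name, and the
function signatures are an assumption based on std/mem.zig at the time of writing, so check the
documentation comments there before relying on them.
</p>
{#code_begin|test|counting_allocator#}
const std = @import("std");
const Allocator = std.mem.Allocator;
const assert = std.debug.assert;

// Hypothetical wrapper: forwards every request to `child` and counts reallocFn calls.
const CountingAllocator = struct {
    allocator: Allocator,
    child: *Allocator,
    realloc_calls: usize,

    fn init(child: *Allocator) CountingAllocator {
        return CountingAllocator{
            .allocator = Allocator{
                .reallocFn = realloc,
                .shrinkFn = shrink,
            },
            .child = child,
            .realloc_calls = 0,
        };
    }

    // Assumed signature; new allocations arrive here with an empty `old_mem`.
    fn realloc(a: *Allocator, old_mem: []u8, old_align: u29, new_size: usize, new_align: u29) Allocator.Error![]u8 {
        // Recover the wrapper from the pointer to its embedded interface field.
        const self = @fieldParentPtr(CountingAllocator, "allocator", a);
        self.realloc_calls += 1;
        return self.child.reallocFn(self.child, old_mem, old_align, new_size, new_align);
    }

    // Assumed signature; shrinking and freeing cannot fail.
    fn shrink(a: *Allocator, old_mem: []u8, old_align: u29, new_size: usize, new_align: u29) []u8 {
        const self = @fieldParentPtr(CountingAllocator, "allocator", a);
        return self.child.shrinkFn(self.child, old_mem, old_align, new_size, new_align);
    }
};

test "counting allocator forwards to its child" {
    var buffer: [64]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(buffer[0..]);
    var counting = CountingAllocator.init(&fba.allocator);
    const allocator = &counting.allocator;

    const bytes = try allocator.alloc(u8, 8);
    defer allocator.free(bytes);

    assert(bytes.len == 8);
    assert(counting.realloc_calls == 1);
}
{#code_end#}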
8090+
<p>
8091+
There are many example allocators to look at for inspiration. Look at std/heap.zig and
8092+
at this
8093+
<a href="https://github.com/andrewrk/zig-general-purpose-allocator/">work-in-progress general purpose allocator</a>.
8094+
TODO: once <a href="https://github.com/ziglang/zig/issues/21">#21</a> is done, link to the docs
8095+
here.
8096+
</p>
8097+
{#header_close#}
8098+
8099+
{#header_open|Heap Allocation Failure#}
8100+
<p>
8101+
Many programming languages choose to handle the possibility of heap allocation failure by
8102+
unconditionally crashing. By convention, Zig programmers do not consider this to be a
8103+
satisfactory solution. Instead, {#syntax#}error.OutOfMemory{#endsyntax#} represents
8104+
heap allocation failure, and Zig libraries return this error code whenever heap allocation
8105+
failure prevented an operation from completing successfully.
8106+
</p>
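<p>
A minimal sketch of this convention, assuming the allocator API used elsewhere in this section:
{#syntax#}makeScratch{#endsyntax#} is a hypothetical helper, and the deliberately tiny buffer forces the
second request to fail. The failure reaches the caller as an ordinary error value that can be handled.
</p>
{#code_begin|test|out_of_memory#}
const std = @import("std");
const Allocator = std.mem.Allocator;
const assert = std.debug.assert;

// Hypothetical helper: propagates error.OutOfMemory to the caller instead of crashing.
fn makeScratch(allocator: *Allocator, len: usize) ![]u8 {
    const scratch = try allocator.alloc(u8, len);
    std.mem.set(u8, scratch, 0);
    return scratch;
}

test "allocation failure is reported as error.OutOfMemory" {
    // Only 8 bytes are available in total.
    var tiny: [8]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(tiny[0..]);
    const allocator = &fba.allocator;

    // A request that fits succeeds.
    const ok = try makeScratch(allocator, 4);
    assert(ok.len == 4);

    // A request that cannot fit comes back as an error the caller can inspect.
    if (makeScratch(allocator, 64)) |_| {
        unreachable;
    } else |err| {
        assert(err == error.OutOfMemory);
    }
}
{#code_end#}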
8107+
<p>
8108+
Some have argued that because some operating systems such as Linux have memory overcommit enabled by
8109+
default, it is pointless to handle heap allocation failure. There are many problems with this reasoning:
8110+
</p>
8111+
<ul>
8112+
<li>Only some operating systems have an overcommit feature.
8113+
<ul>
8114+
<li>Linux has it enabled by default, but it is configurable.</li>
8115+
<li>Windows does not overcommit.</li>
8116+
<li>Embedded systems do not have overcommit.</li>
8117+
<li>Hobby operating systems may or may not have overcommit.</li>
8118+
</ul>
8119+
</li>
8120+
<li>
8121+
For real-time systems, not only is there no overcommit, but typically the maximum amount
8122+
of memory per application is determined ahead of time.
8123+
</li>
8124+
<li>
8125+
When writing a library, one of the main goals is code reuse. By making code handle
8126+
allocation failure correctly, a library becomes eligible to be reused in
8127+
more contexts.
8128+
</li>
8129+
<li>
8130+
Although some software has grown to depend on overcommit being enabled, its existence
8131+
is the source of countless user experience disasters. When a system with overcommit enabled,
8132+
such as Linux on default settings, comes close to memory exhaustion, the system locks up
8133+
and becomes unusable. At this point, the OOM Killer selects an application to kill
8134+
based on heuristics. This non-deterministic decision often results in an important process
8135+
being killed, and often fails to return the system back to working order.
8136+
</li>
8137+
</ul>
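<p>
Handling allocation failure correctly usually also means releasing whatever was allocated before the
failure. The sketch below is one way to do that with {#syntax#}errdefer{#endsyntax#}.
{#syntax#}Pair{#endsyntax#} and {#syntax#}makePair{#endsyntax#} are hypothetical, and the
{#syntax#}std.debug.FailingAllocator.init{#endsyntax#} arguments (a child allocator and the index of the
allocation that should fail) are an assumption about its interface.
</p>
{#code_begin|test|errdefer_cleanup#}
const std = @import("std");
const Allocator = std.mem.Allocator;
const assert = std.debug.assert;

const Pair = struct {
    first: []u8,
    second: []u8,
};

// Hypothetical function that needs two allocations to succeed.
fn makePair(allocator: *Allocator) !Pair {
    const first = try allocator.alloc(u8, 8);
    // If anything below returns an error, give `first` back before returning.
    errdefer allocator.free(first);

    const second = try allocator.alloc(u8, 8);
    return Pair{
        .first = first,
        .second = second,
    };
}

test "second allocation fails and the error is propagated" {
    var buffer: [64]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(buffer[0..]);
    // Assumed behavior: allocation index 0 succeeds, index 1 fails.
    var failing = std.debug.FailingAllocator.init(&fba.allocator, 1);

    if (makePair(&failing.allocator)) |_| {
        unreachable;
    } else |err| {
        assert(err == error.OutOfMemory);
    }
}
{#code_end#}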
8138+
{#header_close#}
8139+
8140+
{#header_open|Recursion#}
8141+
<p>
8142+
Recursion is a fundamental tool in modeling software. However, it has an often-overlooked problem:
8143+
unbounded memory allocation.
8144+
</p>
8145+
<p>
8146+
Recursion is an area of active experimentation in Zig and so the documentation here is not final.
8147+
You can read a
8148+
<a href="https://ziglang.org/download/0.3.0/release-notes.html#recursion">summary of recursion status in the 0.3.0 release notes</a>.
8149+
</p>
8150+
<p>
8151+
The short summary is that currently recursion works normally as you would expect. Although Zig code
8152+
is not yet protected from stack overflow, it is planned that a future version of Zig will provide
8153+
such protection, with some degree of cooperation from Zig code required.
8154+
</p>
8155+
{#header_close#}
8156+
8157+
{#header_open|Lifetime and Ownership#}
8158+
<p>
8159+
It is the Zig programmer's responsibility to ensure that a {#link|pointer|Pointers#} is not
8160+
accessed when the memory pointed to is no longer available. Note that a {#link|slice|Slices#}
8161+
is a form of pointer, in that it references other memory.
8162+
</p>
8163+
<p>
8164+
In order to prevent bugs, there are some helpful conventions to follow when dealing with pointers.
8165+
In general, when a function returns a pointer, the documentation for the function should explain
8166+
who "owns" the pointer. This concept helps the programmer decide when it is appropriate, if ever,
8167+
to free the pointer.
8168+
</p>
8169+
<p>
8170+
For example, the function's documentation may say "caller owns the returned memory", in which case
8171+
the code that calls the function must have a plan for when to free that memory. In this situation,
8172+
the function will most likely accept an {#syntax#}*Allocator{#endsyntax#} parameter.
8173+
</p>
8174+
<p>
8175+
Sometimes the lifetime of a pointer may be more complicated. For example, when using
8176+
{#syntax#}std.ArrayList(T).toSlice(){#endsyntax#}, the returned slice has a lifetime that remains
8177+
valid until the next time the list is resized, such as by appending new elements.
8178+
</p>
8179+
<p>
8180+
The API documentation for functions and data structures should take great care to explain
8181+
the ownership and lifetime semantics of pointers. Ownership determines whose responsibility it
8182+
is to free the memory referenced by the pointer, and lifetime determines the point at which
8183+
the memory must no longer be accessed (lest {#link|Undefined Behavior#} occur).
8184+
</p>
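<p>
As an illustration of the "caller owns the returned memory" convention, here is a minimal sketch.
{#syntax#}repeat{#endsyntax#} is a hypothetical function and the buffer size is arbitrary. The
documentation comment states the ownership rule, and the caller pairs the call with a
{#syntax#}defer{#endsyntax#} that frees the result using the same allocator.
</p>
{#code_begin|test|caller_owns_memory#}
const std = @import("std");
const Allocator = std.mem.Allocator;
const assert = std.debug.assert;

/// Returns `n` copies of `byte`.
/// Caller owns the returned memory and must free it with the same allocator.
fn repeat(allocator: *Allocator, byte: u8, n: usize) ![]u8 {
    const result = try allocator.alloc(u8, n);
    std.mem.set(u8, result, byte);
    return result;
}

test "caller frees what it owns" {
    var buffer: [64]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(buffer[0..]);
    const allocator = &fba.allocator;

    const zs = try repeat(allocator, 'z', 3);
    // The caller decides when the memory is released; `zs` must not be used after this runs.
    defer allocator.free(zs);

    assert(std.mem.eql(u8, zs, "zzz"));
}
{#code_end#}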
8185+
{#header_close#}
79388186

79398187
{#header_close#}
79408188
{#header_open|Compile Variables#}
