# Remove `cast_ref_to_mut` related to FreeList, PageResource and VMMap (#853)
github-merge-queue bot pushed a commit that referenced this issue on May 10, 2024:
> This PR removes the `shared_fl_map` field from `Map32`, and removes the `fl_map` and `fl_page_resource` fields from `Map64`. These global maps were used to enumerate and update the starting address of spaces (stored in `CommonFreeListPageResource` instances), for several different purposes.
>
> 1. `Map32` can only decide the starting address of discontiguous spaces after all contiguous spaces are placed, so it needs to modify the start address of all discontiguous `FreeListPageResource` instances.
>    - `shared_fl_map` was used to locate `CommonFreeListPageResource` instances by the space ordinal.
>    - With this PR, we use `Plan::for_each_space_mut` to enumerate spaces, and use the newly added `Space::maybe_get_page_resource_mut` method to get the page resource (if the space has one) in order to update the starting addresses.
> 2. `Map64` allocates a `RawMemoryFreeList` at the beginning of each space that uses `FreeListPageResource`. It needs to update the start address of each space to take the address range occupied by the `RawMemoryFreeList` into account.
>    - `fl_page_resource` was used to locate `CommonFreeListPageResource` instances by the space ordinal.
>    - `fl_map` was used to locate the underlying `RawMemoryFreeList` instances of the corresponding `FreeListPageResource` by the space ordinal.
>    - With this PR, we initialize the start address of the `FreeListPageResource` immediately after a `RawMemoryFreeList` is created, taking the limit of the `RawMemoryFreeList` into account, so there is no longer any need to update the starting address later.
>
> Because those global maps are removed, several changes are made to `VMMap` and its implementations `Map32` and `Map64`:
>
> - `VMMap::boot` and `VMMap::bind_freelist` no longer serve any purpose, and are removed.
> - `Map64::finalize_static_space_map` now does nothing but set `finalized` to true.
>
> The `CommonFreeListPageResource` data structure was introduced for the sole purpose of being referenced by those global maps. This PR removes `CommonFreeListPageResource` and `FreeListPageResourceInner`, and relocates their fields:
>
> - `common: CommonPageResource` is now a field of `FreeListPageResource`.
> - `free_list: Box<dyn FreeList>` and `start: Address` are moved into `FreeListPageResourceSync` because they have always been accessed while holding the mutex `FreeListPageResource::sync`.
>   - Note that `FreeListPageResource::release_pages` used to call `free_list.size()` without holding the `sync` mutex, and subsequently called `free_list.free()` with the `sync` mutex held. The algorithm of `FreeList` guarantees that `free()` will not race with `size()`, but the `Mutex` forbids calling the read-only operation `size()` without holding the lock, because in general a read-write operation may race with read-only operations, too. For now, we extended the scope of the mutex to cover the whole function.
>
> After the changes above, `FreeListPageResource` no longer needs `UnsafeCell`.
>
> This PR is one step of the greater goal of removing unsafe operations related to FreeList, PageResource and VMMap. See: #853
>
> Related PRs:
>
> - #925: This PR is a subset of that PR.
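As a concrete illustration of the `release_pages` change described in the commit message above, here is a minimal sketch with simplified stand-in types (not mmtk-core's actual definitions or signatures):

```rust
use std::sync::Mutex;

// Simplified stand-ins; the real `FreeList` trait in mmtk-core has more
// methods and different signatures.
struct FreeListInner;
impl FreeListInner {
    fn size(&self, _unit: i32) -> i32 { 1 } // read-only query
    fn free(&mut self, _unit: i32) {}       // read-write update
}

struct FreeListPageResource {
    sync: Mutex<FreeListInner>,
}

impl FreeListPageResource {
    // Both calls now happen under one lock acquisition. FreeList's own
    // algorithm guarantees size() cannot race with free(), but a Mutex
    // cannot express that, so the lock simply covers the whole function.
    fn release_pages(&self, unit: i32) {
        let mut fl = self.sync.lock().unwrap();
        let _pages = fl.size(unit); // was: called without holding `sync`
        fl.free(unit);
    }
}
```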
parent: #850
Task list:

- Make `FreeList` thread-safe and/or use `FreeList` in a safe way.
- Remove unsafe code related to `FreeListPageResource` and `VMMap`.
- Refactor `VMMap`, or introduce a more graceful way to allocate and manage chunks.

## Problem
### FreeList

#### `FreeList` is not thread-safe
The `FreeList` implementation is ported from JikesRVM. In JikesRVM, the `GenericFreeList` base class provided a free list implementation based on an abstract array of `int` (`i32` in Rust), but is agnostic to how the array is allocated. It has the abstract methods `getEntry` and `setEntry`, and concrete table implementations, including `RawMemoryFreeList` and `IntArrayFreeList`, implement those methods to access the underlying array. Particularly, in `IntArrayFreeList`, the freelist is backed by an `int[]`.

Primitive FreeList operations are not thread-safe. `GenericFreeList` often separately updates the "low" and "high" bit-fields of each `int` (32-bit integer). Complex operations (such as splitting a region, or merging two adjacent regions) take multiple updates to complete, and they never use any atomic read-modify-write operations.

In JikesRVM, `FreeListPageResource.allocPages` holds the `PageResource.lock`, and `FreeListPageResource.freePage` is only executed by one thread. This suggests that the `FreeList` itself is not thread-safe, but is used with external synchronisation.

Solution: `FreeList` should be wrapped in a `Mutex`. Actually, `CommonFreeListPageResource.free_list` is always accessed while holding a mutex (`FreeListPageResource::sync`). This will eliminate the need to call `table_mut`, because a thread always gets `&mut FreeList` when it acquires the `Mutex`. The PR #925 moves `CommonFreeListPageResource` into `sync`.
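A minimal sketch of that layout, assuming simplified stand-in types (the real definitions live in mmtk-core):

```rust
use std::sync::Mutex;

// Illustrative stand-ins; the real trait has many more operations.
trait FreeList: Send { /* getEntry/setEntry-style accessors, alloc, free */ }
#[derive(Clone, Copy)]
struct Address(usize);

// Everything that was only ever touched under `sync` now lives inside it.
struct FreeListPageResourceSync {
    free_list: Box<dyn FreeList>,
    start: Address,
}

struct FreeListPageResource {
    sync: Mutex<FreeListPageResourceSync>,
}

impl FreeListPageResource {
    fn with_free_list<R>(&self, f: impl FnOnce(&mut dyn FreeList) -> R) -> R {
        // Acquiring the lock yields `&mut FreeListPageResourceSync`, hence
        // `&mut dyn FreeList`: no `table_mut` and no `UnsafeCell` required.
        let mut sync = self.sync.lock().unwrap();
        f(sync.free_list.as_mut())
    }
}
```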
#### Two-level FreeList structure
The `IntArrayFreeList` has a two-level design where children share the same underlying array with the parent. The parent free list has multiple "heads" (1 + `NUM_OF_CHILDREN`). The parent uses one of the heads, and each child uses one head.

If I read the JikesRVM source code correctly, the "parent head" is owned by the global `Map32` instance, and is protected by `Map32.lock`. Each "child head" is owned by a `FreeListPageResource`, and is protected by `PageResource.lock`. A page resource may acquire pages from the global `Map32` instance while holding `Map32.lock`. This removes FreeList entries from the doubly-linked list of the "parent head". It then inserts those entries into the PageResource's "child head" while holding `PageResource.lock`. In this way, the single global array can contain multiple doubly-linked lists, each starting from one head, and different doubly-linked lists have no overlapping entries. Therefore, it is safe for two different threads to access two different doubly-linked lists using two different heads, even though their entries are allocated in one single contiguous global array.

Solution: Use `unsafe` to access the underlying int array if we know different heads lead to non-overlapping linked lists. The implementer of `IntArrayFreeList` is responsible for guaranteeing the safety (see the sketch below).

Alternative solution: Abandon the current FreeList design, and design an Immix-like page allocator for better locality.
### PageResource

#### Determining the start address of spaces that include `RawMemoryFreeList`
(Update: Fixed in #953)
On 64-bit systems, page resources use `RawMemoryFreeList`. A `RawMemoryFreeList` occupies a region of addresses at the start of the space it manages. Therefore, the region of memory available for a space to allocate objects starts after the region the `RawMemoryFreeList` occupies. The size of the `RawMemoryFreeList` can be determined when it is created.

However, the current implementation is over-designed: it postpones the calculation of the start of the space (the address after the `RawMemoryFreeList`) until after all spaces are created. This is unnecessary. The current implementation records the addresses of FreeList instances in `Map64::{fl_page_resources, fl_map}` and visits them later, which involves `unsafe` and `cast_ref_to_mut`.

Solution: The address can be eagerly computed if the FreeList implementation is `RawMemoryFreeList`. The actual FreeList implementation may be `RawMemoryFreeList` or `IntArrayFreeList`, depending on `VMMap`, so we can let `FreeList` report an `Option<Address>`, allowing it to report its limit eagerly, right after creation, if possible.

PR #925 (closed) uses this approach to eliminate those global maps and `unsafe` statements.

PR #953 (merged) lets `create_freelist` return a `CreateFreeListResult { free_list, space_displacement }`, where the `space_displacement` reports the number of bytes to be added to the starting address.
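A sketch of how such a result might be consumed; the types here are stand-ins, with fields following the description above:

```rust
// Stand-ins for illustration only.
trait FreeList {}
type Address = usize;

struct CreateFreeListResult {
    free_list: Box<dyn FreeList>,
    space_displacement: usize, // bytes the free list itself occupies
}

// The space's usable start can now be finalized at creation time:
// no global map, no later fix-up pass.
fn usable_start(raw_start: Address, result: &CreateFreeListResult) -> Address {
    // For Map64 + RawMemoryFreeList the displacement covers the list's
    // address range; for an IntArrayFreeList it would be zero.
    raw_start + result.space_displacement
}
```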
#### Determining the start address of disjoint spaces
(Update: Fixed in #953)
On 32-bit systems, some spaces are contiguous, while other spaces are discontiguous. When creating a plan, contiguous spaces reserve address ranges from the bottom or the top of the heap address region. After that, all discontiguous spaces share the remaining address region. This means the start address of the discontiguous region cannot be determined until the ranges of all contiguous spaces are allocated.

The current implementation uses a hack. It registers all `CommonFreeListPageResource` instances in a global list (`Map32::shared_fl_map`) so that, after the plan is created, `Map32::finalize_static_space_map` can update the `start` field of all those `CommonFreeListPageResource` structs. This is unsafe because each `CommonFreeListPageResource` is referred to mutably by both `Map32::shared_fl_map` and a `FreeListPageResource`.

Solution: The actual requirement is the ability to visit all `FreeListPageResource` instances after creating the plan, and update their `start` fields. We can just let Space own PageResource, and enumerate PageResources by enumerating the spaces of the plan. When the plan is just created, we have a unique mutable reference to it, so we have the chance to safely mutate it. We just need to enumerate all spaces and their underlying PageResources, which is possible (see the sketch below).

PR #925 (closed) uses this approach to eliminate those global maps and `unsafe` statements.

PR #934 (merged) introduced a way to enumerate spaces in a plan.

PR #953 (merged) uses that method to enumerate spaces and update the `start` field of the page resources of disjoint spaces.
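A sketch of the enumeration approach; `for_each_space_mut` and `maybe_get_page_resource_mut` are the method names from the PR #953 description, but the signatures and the `update_start` helper here are simplified assumptions:

```rust
// Simplified stand-ins for the real mmtk-core traits.
type Address = usize;

trait PageResource {
    fn update_start(&mut self, start: Address); // hypothetical helper
}
trait Space {
    fn maybe_get_page_resource_mut(&mut self) -> Option<&mut dyn PageResource>;
}
trait Plan {
    fn for_each_space_mut(&mut self, f: &mut dyn FnMut(&mut dyn Space));
}

// The plan was just created, so we hold `&mut dyn Plan`: mutating every
// page resource is safe without any global registry of pointers.
fn finalize_space_starts(plan: &mut dyn Plan, discontig_start: Address) {
    plan.for_each_space_mut(&mut |space| {
        if let Some(pr) = space.maybe_get_page_resource_mut() {
            pr.update_start(discontig_start);
        }
    });
}
```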
#### Looking up `RawMemoryFreeList` from address

(Update: Fixed in #953)
On 64-bit systems, when a space needs to be expanded, it asks Map64 to map more memory. In `Map64::allocate_contiguous_chunks`, it will bump the high-water mark and, if the page resource is using `RawMemoryFreeList`, attempt to update the `RawMemoryFreeList`, too. The problem is, `Map64` doesn't know whether the current page resource is actually a `FreeListPageResource`, nor whether it is actually using `RawMemoryFreeList`. The current code uses a global table, `Map64::fl_map`, to look up the `RawMemoryFreeList` from an address. The map `Map64::fl_map` itself is unsafe because it holds another reference to a `RawMemoryFreeList` while a `FreeListPageResource` is already holding a reference to it.

Solution: We can let the caller pass in a `&mut` to the `RawMemoryFreeList` instance it is using. This is possible because the caller is a FreeListPageResource. We just need to downcast the `Box<dyn FreeList>` to see if it actually points to a `RawMemoryFreeList`. If not, it doesn't need to be updated, either (see the sketch below).

The PR #925 (closed) lets `Map64::allocate_contiguous_chunks` take an extra `Option<&mut RawMemoryFreeList>` parameter so that it can update the `RawMemoryFreeList`.

PR #953 (merged) lets the `PageResource` pass an `Option<&mut dyn FreeList>` (if using a free list) to `VMMap::allocate_contiguous_chunk` so that `Map64` can update it.
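A sketch of the downcast-and-pass idea using `std::any::Any`; the trait plumbing and function names here are illustrative, not mmtk-core's actual API:

```rust
use std::any::Any;

// Illustrative trait plumbing for downcasting a trait object.
trait FreeList: Any {
    fn as_any_mut(&mut self) -> &mut dyn Any;
}

struct RawMemoryFreeList;
impl FreeList for RawMemoryFreeList {
    fn as_any_mut(&mut self) -> &mut dyn Any { self }
}

struct IntArrayFreeList;
impl FreeList for IntArrayFreeList {
    fn as_any_mut(&mut self) -> &mut dyn Any { self }
}

// The page resource owns the Box<dyn FreeList>, so it alone decides
// whether Map64 receives a &mut RawMemoryFreeList: no global fl_map.
fn grow_space(free_list: &mut Box<dyn FreeList>) {
    let raw = free_list.as_any_mut().downcast_mut::<RawMemoryFreeList>();
    allocate_contiguous_chunks(raw);
}

fn allocate_contiguous_chunks(fl: Option<&mut RawMemoryFreeList>) {
    if let Some(_fl) = fl {
        // Map64 would grow the RawMemoryFreeList here.
    }
}
```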
### VMMap
#### Data structures for managing chunks for different spaces
`Map32` implements a "free-list-based discontiguous chunk allocator". Multiple discontiguous spaces can grab chunks from a freelist (`region_map`) which manages the shared region, and Map32 keeps track of which space owns which chunk (`descriptor_map`) and links "contiguous chunks" together using a linked list (`prev_link` and `next_link`).

`Map64` implements a "monotonic contiguous chunk allocator". For each space, Map64 keeps track of the lower bound and the upper bound of the region of memory the space occupies (`base_address` and `high_water`).

Among those data structures:

- `Map32`:
  - `descriptor_map: Vec<SpaceDescriptor>` is per-chunk
  - `prev_link: Vec<i32>` is per-chunk
  - `next_link: Vec<i32>` is per-chunk
  - `region_map: IntArrayFreeList` is also per-chunk
  - `global_page_map: IntArrayFreeList` is per-page
- `Map64`:
  - `descriptor_map: Vec<SpaceDescriptor>` is per-space
  - `base_address: Vec<Address>` is per-space
  - `high_water: Vec<Address>` is per-space

It is obvious that those data structures manage resources in a centralised way, but those resources may be distributed to different spaces, and will be accessed and modified by different spaces concurrently. The safety of those data structures lies in the fact that different spaces own different portions of those vectors. This is not an idiomatic Rust use pattern, and may need the `unsafe` keyword.

For `Map32`, unallocated chunks are effectively owned by the `Map32` itself (more precisely, by the `region_map`, a free list that records free chunks). I think `region_map` requires a Mutex, while other data structures, such as `descriptor_map` and `{prev,next}_link`, can be accessed without synchronisation by the space that owns the region.

For `Map64`, `base_address` and `high_water` are effectively per-space metadata. It may be possible to distribute those vectors into Space or PageResource instances.

Solution: (1) add a mutex to `region_map`, (2) use `unsafe` to access the other vectors in `Map32`, and (3) distribute the per-space metadata (base address and high water) to Space, PageResource, or other per-space data structures. A sketch of (1) and (2) follows.
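A sketch of solutions (1) and (2), assuming the per-chunk ownership invariant described above; all field types and names are illustrative:

```rust
use std::cell::UnsafeCell;
use std::sync::Mutex;

// Illustrative stand-ins.
struct IntArrayFreeList;
#[derive(Clone, Copy)]
struct SpaceDescriptor(usize);

struct Map32 {
    // (1) The chunk freelist is shared by all discontiguous spaces,
    // so it gets a real lock.
    region_map: Mutex<IntArrayFreeList>,
    // (2) Per-chunk metadata: each chunk is owned by exactly one space,
    // so the owning space may touch its own entries without locking.
    // UnsafeCell makes that explicit; the non-overlap of ownership is
    // the invariant any unsafe accessor must uphold.
    descriptor_map: UnsafeCell<Vec<SpaceDescriptor>>,
    prev_link: UnsafeCell<Vec<i32>>,
    next_link: UnsafeCell<Vec<i32>>,
}

// SAFETY: relies on the per-chunk ownership invariant described above.
unsafe impl Sync for Map32 {}

impl Map32 {
    /// Set the descriptor of a chunk owned by the calling space.
    /// SAFETY: the caller must own `chunk`.
    unsafe fn set_descriptor(&self, chunk: usize, d: SpaceDescriptor) {
        (*self.descriptor_map.get())[chunk] = d;
    }
}
```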
## Progress
- #893 (merged) eliminates all `cast_ref_to_mut` use cases by using `UnsafeCell`. It has been merged, so mmtk-core will compile on the latest stable Rust version.
- #925 (closed) tries to refactor the related code to eliminate the need for unsafe code. It removes many of the cases mentioned above, but not all.
- #953 (merged) removed `shared_fl_map` from `Map32`, and `fl_map` and `fl_page_resource` from `Map64`. It removed the unsafe operations related to `FreeListPageResource`.