
fs.Dir.deleteTree: Add configurable max_retries to avoid infinite loops #15467


Closed
wants to merge 2 commits from the delete-tree-retries branch

Conversation

@squeek502 (Collaborator) commented Apr 25, 2023

Previously, it was possible for deleteTree to get into a state where it would retry deleting the same directory infinitely: it would think all of the children had been successfully deleted, but deleting the directory itself would keep hitting error.DirNotEmpty. Now there is a user-configurable limit on the number of times it will retry deleting directories, and deleteTree returns error.TooManyRetries if that limit is exceeded.

Closes #15465, possibly related to #14948

Follow-up to #13073 (comment) (cc @matu3ba)
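
For illustration, here is a minimal sketch of what a call site could look like with the proposed option. The names max_retries, kind_hint, and error.TooManyRetries come from this PR; the concrete values, the helper function, and the exact options-struct layout are made up:

    const std = @import("std");

    // Hypothetical call site for the API proposed in this PR.
    fn cleanCacheDir(parent: std.fs.Dir) !void {
        parent.deleteTree("zig-cache", .{
            .max_retries = 8, // illustrative limit; give up instead of retrying forever
            .kind_hint = .Directory, // saves a syscall since the kind is known here
        }) catch |err| switch (err) {
            error.TooManyRetries => {
                // All known children were deleted, but the directory kept coming
                // back as DirNotEmpty, i.e. it is likely being modified concurrently.
                std.log.warn("zig-cache is being modified while it is being deleted", .{});
            },
            else => |e| return e,
        };
    }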


Notes:

  • usize for max_retries is definitely overkill, but using a smaller type didn't affect @sizeOf(StackItem) in deleteTree, so I just stuck with usize
  • Adding the options parameter also allowed kind_hint to be made configurable, which can save a syscall if you already know the type of sub_path (it defaults to .File since that's how it has always worked, but maybe .Directory would be a better default? Or maybe callsites should just pass .Directory more often?). A rough sketch of the resulting options struct follows this list.
  • I believe the added test relies on the DeleteFile behavior added in #15316 ("windows: use NtSetInformationFile in DeleteFile", where DELETE_PENDING is treated as success); relevant issue: #6452 ("unhandled NTSTATUS: DELETE_PENDING in std.os.windows.DeleteFile")
  • It may make some sense to do something like:
        self.parent_dir.deleteTree(&self.sub_path, .{}) catch |err| switch (err) {
            error.TooManyRetries => {
                @panic("too many retries while deleting tmp dir; it is either being continually modified or a file handle was leaked during the test");
            },
            else => {},
        };

in TmpDir.cleanup
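
As referenced in the notes above, here is a rough sketch of what the options struct described there might look like. The field names and rationale come from the notes; the exact layout and the default retry count shown here are assumptions, not the PR's actual code:

    pub const DeleteTreeOptions = struct {
        /// Maximum number of times a directory is re-attempted after hitting
        /// error.DirNotEmpty before deleteTree gives up with error.TooManyRetries.
        /// usize is overkill, but a smaller type didn't shrink @sizeOf(StackItem).
        /// (The default value here is purely illustrative.)
        max_retries: usize = 4,
        /// Hint for the kind of sub_path, used to save a syscall when the caller
        /// already knows it. Defaults to .File to match the previous behavior.
        kind_hint: File.Kind = .File,
    };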

@squeek502 force-pushed the delete-tree-retries branch from 608ab05 to c63897b on Apr 26, 2023 at 02:12
It's not quite the same as deleteTree since it doesn't keep track of the stack of directories that have been tried, but in normal conditions (i.e. directories aren't being created within the to-be-deleted tree) it should behave similarly.
@andrewrk andrewrk self-assigned this Apr 29, 2023
@andrewrk andrewrk removed their assignment Oct 12, 2023
@andrewrk (Member) left a comment:


I think this is not a great way to design the API. There should not be a concept of retrying to delete something; that's just not how file systems are meant to be modeled.

DELETE_PENDING should cause the function to block until the deletion is no longer pending. It would be acceptable for an option to be a maximum amount of time to wait before it returns an error for the pending deletion to not complete in time.

As for an adversarial agent which has access to the file system, I don't think this is worth modeling at all.
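
For comparison, a rough sketch of the alternative shape suggested here, where waiting is bounded by time rather than by a retry count. Everything in it (the helper name, max_wait_ns, error.DeletionTimedOut) is hypothetical and not an existing std API:

    const std = @import("std");

    /// Hypothetical helper: keep trying to remove `sub_path` until it succeeds
    /// or `max_wait_ns` elapses, instead of counting retries.
    fn deleteDirWithin(parent: std.fs.Dir, sub_path: []const u8, max_wait_ns: u64) !void {
        var timer = try std.time.Timer.start();
        while (true) {
            parent.deleteDir(sub_path) catch |err| switch (err) {
                error.DirNotEmpty => {
                    // A child deletion may still be pending (e.g. DELETE_PENDING
                    // on Windows); back off briefly and retry until the deadline.
                    if (timer.read() > max_wait_ns) return error.DeletionTimedOut;
                    std.time.sleep(10 * std.time.ns_per_ms);
                    continue;
                },
                else => |e| return e,
            };
            return;
        }
    }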

Linked issue (may be closed by merging this PR): fs.Dir.deleteTree has no upper bound on retries (#15465)