-
Notifications
You must be signed in to change notification settings - Fork 2.6k
support recursive read only (rro
) option for mounts
#25680
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -36,6 +36,7 @@ import ( | |
"github.com/containers/podman/v5/utils" | ||
"github.com/containers/storage/pkg/idtools" | ||
spec "github.com/opencontainers/runtime-spec/specs-go" | ||
"github.com/opencontainers/runtime-spec/specs-go/features" | ||
"github.com/sirupsen/logrus" | ||
"golang.org/x/sys/unix" | ||
) | ||
|
@@ -66,6 +67,7 @@ type ConmonOCIRuntime struct { | |
supportsNoCgroups bool | ||
enableKeyring bool | ||
persistDir string | ||
features *features.Features | ||
} | ||
|
||
// Make a new Conmon-based OCI runtime with the given options. | ||
|
@@ -131,6 +133,12 @@ func newConmonOCIRuntime(name string, paths []string, conmonPath string, runtime | |
break | ||
} | ||
|
||
features, err := runtime.getOCIRuntimeFeatures() | ||
if err != nil { | ||
return nil, fmt.Errorf("getting %s features: %w", runtime.name, err) | ||
} | ||
runtime.features = features | ||
Comment on lines
+136
to
+140
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. IMO this is not acceptable for performance reasons. Every podman command would then have to run this even when the vast majority of them will never need this. Second how many runtimes actually support the features command? Right not this hard error which means it can cause regressions. I see three options:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We have also, in the past, added fields to containers.conf to indicate which runtimes supported certain features - see There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Alternatively, I wonder if we could lazy-initialize this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
IMO we could cache the result under the tmpdir using the runtime path as the key, and then use the runtime binary stat ( What do you think? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That is my suggest 2, using sync.OnceValues()
How so? Any command that does not start a container will certainly never need it. podman stop, podman rm, podman ps, podman logs, etc... How I look at this only the minority will actually need this.
Yes that sounds good to me, there is an overhead for stat and reading the file but compared to the rest we do that is likely not a problem. Should be much better than spawning a new process each time. Even worse reading the code again we actually call newConmonOCIRuntime() on all runtimes from containers.conf so we pay the overhead multiple times for all runtimes which really doesn't feel right. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I was referring only to the commands that run a container, and that is where we care more about performance. On my system I've:
a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Sounds like a good idea. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (sorry for the multi-week delay in response)
I prefer lazy-initializing the runtime feature over the other two options suggested by @Luap99 as well. As far as caching the output under tmpdir, I'm under the assumption--as @giuseppe pointed out--that it's to avoid a situation where the runtime changes. It shouldn't make too much of a dent in terms of performance. |
||
|
||
// Search the $PATH as last fallback | ||
if !foundPath { | ||
if foundRuntime, err := exec.LookPath(name); err == nil { | ||
|
@@ -839,6 +847,11 @@ func (r *ConmonOCIRuntime) SupportsKVM() bool { | |
return r.supportsKVM | ||
} | ||
|
||
// Features returns the features struct from the OCI runtime | ||
func (r *ConmonOCIRuntime) Features() *features.Features { | ||
return r.features | ||
} | ||
|
||
// AttachSocketPath is the path to a single container's attach socket. | ||
func (r *ConmonOCIRuntime) AttachSocketPath(ctr *Container) (string, error) { | ||
if ctr == nil { | ||
|
@@ -1485,6 +1498,20 @@ func (r *ConmonOCIRuntime) getConmonVersion() (string, error) { | |
return strings.TrimSuffix(strings.Replace(output, "\n", ", ", 1), "\n"), nil | ||
} | ||
|
||
func (r *ConmonOCIRuntime) getOCIRuntimeFeatures() (*features.Features, error) { | ||
var features *features.Features | ||
output, err := utils.ExecCmd(r.path, "features") | ||
if err != nil { | ||
return features, err | ||
} | ||
|
||
if jsonErr := json.Unmarshal([]byte(output), &features); jsonErr != nil { | ||
return features, err | ||
} | ||
|
||
return features, nil | ||
} | ||
|
||
// getOCIRuntimeVersion returns a string representation of the OCI runtime's | ||
// version. | ||
func (r *ConmonOCIRuntime) getOCIRuntimeVersion() (string, error) { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,6 +11,7 @@ import ( | |
"github.com/containers/common/pkg/resize" | ||
"github.com/containers/podman/v5/libpod/define" | ||
spec "github.com/opencontainers/runtime-spec/specs-go" | ||
"github.com/opencontainers/runtime-spec/specs-go/features" | ||
"github.com/sirupsen/logrus" | ||
) | ||
|
||
|
@@ -194,6 +195,11 @@ func (r *MissingRuntime) SupportsKVM() bool { | |
return false | ||
} | ||
|
||
// Features returns nil since this is a missing runtime | ||
func (r *MissingRuntime) Features() *features.Features { | ||
return nil | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should still return an error, so I recommend (*features.Features, error) - the missing runtime can have methods called on it and we want to avoid null pointer panics |
||
} | ||
|
||
// AttachSocketPath does not work as there is no runtime to attach to. | ||
// (Theoretically we could follow ExitFilePath but there is no guarantee the | ||
// container is running and thus has an attach socket...) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,21 +6,28 @@ import ( | |
"io/fs" | ||
"os" | ||
"path/filepath" | ||
"slices" | ||
"strconv" | ||
"strings" | ||
"sync" | ||
"syscall" | ||
|
||
"github.com/containers/podman/v5/libpod/define" | ||
"github.com/containers/podman/v5/pkg/rootless" | ||
"github.com/containers/psgo" | ||
spec "github.com/opencontainers/runtime-spec/specs-go" | ||
"github.com/opencontainers/runtime-spec/specs-go/features" | ||
"github.com/opencontainers/runtime-tools/generate" | ||
"github.com/sirupsen/logrus" | ||
"golang.org/x/sys/unix" | ||
) | ||
|
||
var ( | ||
errNotADevice = errors.New("not a device node") | ||
errNotADevice = errors.New("not a device node") | ||
errKernelDoesNotSupportRRO = errors.New("kernel does not support recursive readonly mount option `rro`") | ||
errRuntimeDoesNotSupportRRO = errors.New("runtime does not support recursive readonly mount option `rro`") | ||
|
||
kernelSupportsRROOnce sync.Once | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. new code should use |
||
) | ||
|
||
// GetContainerPidInformationDescriptors returns a string slice of all supported | ||
|
@@ -258,3 +265,70 @@ func DeviceFromPath(path string) (*spec.LinuxDevice, error) { | |
Minor: int64(unix.Minor(devNumber)), | ||
}, nil | ||
} | ||
|
||
// kernelSupportsRecursivelyReadOnly returns true if the kernel supports recursive readonly mounts | ||
// from https://github.com/moby/moby/blob/master/daemon/daemon_linux.go#L222 | ||
func kernelSupportsRecursivelyReadOnly() error { | ||
fn := func() error { | ||
tmpMnt, err := os.MkdirTemp("", "podman-detect-rro") | ||
if err != nil { | ||
return fmt.Errorf("failed to create a temp directory: %w", err) | ||
} | ||
for { | ||
err = unix.Mount("", tmpMnt, "tmpfs", 0, "") | ||
if !errors.Is(err, unix.EINTR) { | ||
break | ||
} | ||
} | ||
if err != nil { | ||
return fmt.Errorf("failed to mount tmpfs on %q: %w", tmpMnt, err) | ||
} | ||
defer func() { | ||
var umErr error | ||
for { | ||
umErr = unix.Unmount(tmpMnt, 0) | ||
if !errors.Is(umErr, unix.EINTR) { | ||
break | ||
} | ||
} | ||
if umErr != nil { | ||
logrus.Errorf("Failed to unmount %q: %v", tmpMnt, umErr) | ||
} | ||
}() | ||
Comment on lines
+273
to
+297
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. please use |
||
attr := &unix.MountAttr{ | ||
Attr_set: unix.MOUNT_ATTR_RDONLY, | ||
} | ||
for { | ||
err = unix.MountSetattr(-1, tmpMnt, unix.AT_RECURSIVE, attr) | ||
if !errors.Is(err, unix.EINTR) { | ||
break | ||
} | ||
} | ||
// ENOSYS on kernel < 5.12 | ||
if err != nil { | ||
return fmt.Errorf("failed to call mount_setattr with AT_RECURSIVE: %w", err) | ||
} | ||
return nil | ||
} | ||
|
||
kernelSupportsRROOnce.Do(func() { | ||
errKernelDoesNotSupportRRO = fn() | ||
}) | ||
return errKernelDoesNotSupportRRO | ||
Comment on lines
+314
to
+317
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You can simplify this by using |
||
} | ||
|
||
// SupportsRecursiveReadonly returns true if the runtime supports recursive readonly mounts | ||
func SupportsRecursiveReadonly(features *features.Features) error { | ||
if err := kernelSupportsRecursivelyReadOnly(); err != nil { | ||
return err | ||
} | ||
|
||
if features == nil || features.MountOptions == nil { | ||
return errRuntimeDoesNotSupportRRO | ||
} | ||
if !slices.Contains(features.MountOptions, "rro") { | ||
return errRuntimeDoesNotSupportRRO | ||
} | ||
|
||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why check
rro
here? It seems like we should always pass it along unmodified, even if unsupported - if the user explicitly requestedrro
and we silently downgrade, that seems like a problem.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, rro must be passed as it is.
Also we need to have a way to restore the previous behavior. If a user really wants a
ro
, there is no way to achieve it when we always forcero
->rro
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Downgrade shouldn't happen. The intent for an
rro
check here is to upgradero
torro
when supported and to error out when a user explicitly requestsrro
and it isn't supported.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But why? I can understand for backwards compat but that is wrong from a security point of view. I don't know how bad the impact would be when we roll out this change, but we can introduce a non-recursive (
norro
) as part of the transition till the buggyro
behaviour fades out. Wdyt?