diff options
author | Christian Brauner <brauner@kernel.org> | 2024-05-24 12:19:39 +0200 |
---|---|---|
committer | Christian Brauner <brauner@kernel.org> | 2024-05-28 15:57:23 +0200 |
commit | 620c266f394932e5decc4b34683a75dfc59dc2f4 (patch) | |
tree | dc95222f575b0c775325300759c473e5249beb7b /fs/mount.h | |
parent | ef44c8ab06b300a5b9b30e5b630f491ac7bc4d3e (diff) |
fhandle: relax open_by_handle_at() permission checks
A current limitation of open_by_handle_at() is that it's currently not possible
to use it from within containers at all because we require CAP_DAC_READ_SEARCH
in the initial namespace. That's unfortunate because there are scenarios where
using open_by_handle_at() from within containers.
Two examples:
(1) cgroupfs allows to encode cgroups to file handles and reopen them with
open_by_handle_at().
(2) Fanotify allows placing filesystem watches they currently aren't usable in
containers because the returned file handles cannot be used.
Here's a proposal for relaxing the permission check for open_by_handle_at().
(1) Opening file handles when the caller has privileges over the filesystem
(1.1) The caller has an unobstructed view of the filesystem.
(1.2) The caller has permissions to follow a path to the file handle.
This doesn't address the problem of opening a file handle when only a portion
of a filesystem is exposed as is common in containers by e.g., bind-mounting a
subtree. The proposal to solve this use-case is:
(2) Opening file handles when the caller has privileges over a subtree
(2.1) The caller is able to reach the file from the provided mount fd.
(2.2) The caller has permissions to construct an unobstructed path to the
file handle.
(2.3) The caller has permissions to follow a path to the file handle.
The relaxed permission checks are currently restricted to directory file
handles which are what both cgroupfs and fanotify need. Handling disconnected
non-directory file handles would lead to a potentially non-deterministic api.
If a disconnected non-directory file handle is provided we may fail to decode
a valid path that we could use for permission checking. That in itself isn't a
problem as we would just return EACCES in that case. However, confusion may
arise if a non-disconnected dentry ends up in the cache later and those opening
the file handle would suddenly succeed.
* It's potentially possible to use timing information (side-channel) to infer
whether a given inode exists. I don't think that's particularly
problematic. Thanks to Jann for bringing this to my attention.
* An unrelated note (IOW, these are thoughts that apply to
open_by_handle_at() generically and are unrelated to the changes here):
Jann pointed out that we should verify whether deleted files could
potentially be reopened through open_by_handle_at(). I don't think that's
possible though.
Another potential thing to check is whether open_by_handle_at() could be
abused to open internal stuff like memfds or gpu stuff. I don't think so
but I haven't had the time to completely verify this.
This dates back to discussions Amir and I had quite some time ago and thanks to
him for providing a lot of details around the export code and related patches!
Link: https://lore.kernel.org/r/20240524-vfs-open_by_handle_at-v1-1-3d4b7d22736b@kernel.org
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Diffstat (limited to 'fs/mount.h')
-rw-r--r-- | fs/mount.h | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/fs/mount.h b/fs/mount.h index 4a42fc68f4cc..4adce73211ae 100644 --- a/fs/mount.h +++ b/fs/mount.h @@ -152,3 +152,4 @@ static inline void move_from_ns(struct mount *mnt, struct list_head *dt_list) } extern void mnt_cursor_del(struct mnt_namespace *ns, struct mount *cursor); +bool has_locked_children(struct mount *mnt, struct dentry *dentry); |