
Change notifications for network filesystems

By Jake Edge
May 25, 2022

LSFMM

Steve French led a discussion on change notifications for network filesystems in a session at the 2022 Linux Storage, Filesystem, Memory-management and BPF Summit (LSFMM). He is part of the Samba team and noted that both Windows and macOS clients get notified of new and changed files in a shared directory immediately, while on Linux that does not happen. He wanted to explore what it would take to add that functionality.

On Windows and macOS, a file browser automatically shows changes to files in shared network filesystems, but at some point that broke for Linux clients. The inotify mechanism (like its predecessor, dnotify) was added to the kernel to support the Samba server, he said. Remote systems that are talking to a Samba server on Linux can see those kinds of changes, but remote Linux clients cannot.

The client API changed at some point so that network filesystems have no easy way to register to receive these kinds of events. For SMB, he added an ioctl() command that can be used to wait on notifications of these changes. But in order to use that, all of the client programs would need to change to make a filesystem-specific call to get that information.

The underlying problem is that the filesystem servers are not told that a Linux client wants to be notified of changes. That means Linux file browsers do not have the functionality that Windows and Mac users have come to expect. The inotify functionality does not have a hook into Ceph, AFS, or SMB to make them aware that a client wants notifications, he said. Chuck Lever noted that NFS has the notification capability in the protocol, but, like the others, it is not implemented for Linux.

There is also the fanotify API, French said, but he does not know if it would be useful for what he is looking for. Amir Goldstein said that fanotify was originally created by antivirus vendors but that, more recently, work has gone into it to add more functionality. As of about Linux 5.10, fanotify provides almost a superset of the inotify functionality.

One big feature that inotify lacks has been implemented in fanotify: watching an entire filesystem. There are not many applications that use it, because it is new, Goldstein said. He has added fanotify support to inotify-tools and its library, so there are now user-space tools that can be used to watch a filesystem or set of files using the fanotify API.

There are many types of events that an SMB client can get from the server to tell it about changes to timestamps, file creation, file name changes, file deletion, and so on, French said. Those all seem to map reasonably well to fanotify/inotify events; notification of changes to access-control lists (ACLs) is not supported but might need to be, he said. Goldstein said that if there is enough interest, event types can be added to fanotify.

On Linux, David Howells said, the file notifications are mostly used by desktop file managers. KDE starts a daemon to monitor changes and GNOME does something similar, he said; if notifications are not available, then they poll for the information. Goldstein said that it is not that notifications are not available, just that they are not granular enough and that there may be some kinds of changes that do not have notification events, so polling is used for those cases.

Goldstein said that French had been asking for this feature for a long time. The FUSE developers "took a shot at implementing something", he said; it added inotify support for virtiofs. On the Zoom link, Vivek Goyal, who was involved in that work, said that inotify was chosen because it is simpler than fanotify. Whatever notification watches are placed on the local file are forwarded to the remote file server, which sets up inotify and forwards events back to the local filesystem. Based on the feedback on those patches, Goyal said, he has been trying to rework the patches to use fanotify but ran into a number of difficulties. There may be more limitations when using fanotify. French said that it is important to get a handle on what exactly can be supported because the alternative is "really painful": polling.

Jan Kara, also via Zoom, said that it should be fairly straightforward to add the hook for filesystems to inform them that a watch has been added; in the simplest case, the filesystem just says that it does not support the feature. The more difficult part is what happens when the filesystem receives an event and needs to pass it to the client filesystem in a way that user space can receive via fanotify or inotify. For inotify, the inode number and file name are available to send to the client, but that is not true for fanotify, where you may only have the inode number. Goyal agreed that was the problem for virtiofs.

The important thing is to provide a generic mechanism for filesystems so that applications do not have to use multiple filesystem-specific interfaces to get this information, French said. He also wants to avoid polling, which is particularly expensive when done across the network. Josef Bacik said that it seemed reasonable to add the hook to let the filesystems know when a watch has been added; it is up to French and Goyal to work out the details on that.

Howells asked about subtree watches; on Windows you can get notified for changes within a subtree. He wondered if fanotify could add support for that. Goldstein said that it is something that everyone wants, but it is not trivial to do; several attempts have been made over the years, but nothing has been added.

French said that the feature he is looking for is an asynchronous, non-perfect mechanism. Some filesystems, such as SMB and NFS, have strict approaches using delegations or leases to ensure that all events are seen, but that is not usually worth the cost. Those could be used to implement these change notifications, but it should be left up to the filesystem to decide that, he said.

As time wound down, French also wanted to mention that he had not seen any tests for inotify and fanotify in xfstests (which are being renamed to "fstests"). It will be important to have tests to ensure that nothing breaks when the remote notifications are added. But Goldstein said that the tests for notifications are part of the Linux Test Project (LTP) tests. There is a test there for every new feature and regression tests for bugs that have been fixed. Ted Ts'o said that xfstests have historically been used by the developers of different filesystems, while features that were implemented in the virtual filesystem (VFS) layer were tested in LTP. That may need to change as the network filesystems add features to support notifications.


Index entries for this article
Kernel: fanotify
Kernel: Filesystems/Network
Kernel: Inotify
Conference: Storage, Filesystem, Memory-Management and BPF Summit/2022



Change notifications for network filesystems

Posted May 26, 2022 20:53 UTC (Thu) by Kamilion (subscriber, #42576) [Link]

Wait a second, I thought we 'got rid' of inotify *years* ago by stubbing it out to call fanotify?

Change notifications for network filesystems

Posted May 27, 2022 9:05 UTC (Fri) by Fowl (subscriber, #65667) [Link]

It seems remarkable to me that there doesn’t seem to be an abstraction on the user space APIs in the VFS layer? Or is the overlap in functionality so small that it wouldn’t be worth it?

Change notifications for network filesystems

Posted May 29, 2022 21:37 UTC (Sun) by mtodorov (guest, #158788) [Link] (3 responses)

It is, however, limiting not to know which user attempted to open or access a file, or caused a file event.

The prudent approach may be to add uid, gid, real and effective user ids to the structure:

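           struct fanotify_event_metadata {
               __u32 event_len;      /* length of this whole event record */
               __u8 vers;            /* structure version */
               __u8 reserved;
               __u16 metadata_len;   /* length of this structure */
               __aligned_u64 mask;   /* bit mask describing the event */
               __s32 fd;             /* open descriptor for the object, or FAN_NOFD */
               __s32 pid;            /* process (or thread) that caused the event */
           };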
It doesn't seem to have to break anything, since programs rely on even_len rather than sizeof (struct fanotify) to get data.
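To illustrate the point about event_len, here is a minimal C sketch of a reader that knows only the current structure and steps through a buffer with the standard FAN_EVENT_OK() and FAN_EVENT_NEXT() macros. The `struct extended_event` with trailing `uid` and `gid` fields is purely hypothetical (it is the proposal above, not an existing kernel interface), and `count_events()` is a name made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/fanotify.h>

/* Hypothetical larger record: the stock header followed by fields that an
 * older reader knows nothing about. Not an existing kernel structure. */
struct extended_event {
    struct fanotify_event_metadata meta;
    __u32 uid;  /* hypothetical */
    __u32 gid;  /* hypothetical */
};

/* Walk `buf` as a reader that only understands fanotify_event_metadata.
 * Returns the number of events seen and stores the last pid in *last_pid. */
size_t count_events(const void *buf, size_t len, __s32 *last_pid)
{
    assert(buf != NULL && last_pid != NULL);
    const struct fanotify_event_metadata *ev = buf;
    size_t count = 0;

    while (FAN_EVENT_OK(ev, len)) {
        *last_pid = ev->pid;
        count++;
        /* Advances by ev->event_len, not by sizeof(*ev), so any
         * trailing bytes in the record are skipped. */
        ev = FAN_EVENT_NEXT(ev, len);
    }
    return count;
}
```

A reader written this way steps over the extra bytes without noticing them. Programs that instead advance by sizeof(struct fanotify_event_metadata) would break, which is presumably why the structure also carries the vers and metadata_len fields.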

Rationale: it is possible to look up which user owns the PID; however, while that information is being searched for, the process may already have exited.

It also involves a race condition, so the user we find may not be the same one that we gave access to the file. A lookup in the /proc filesystem is expensive and inefficient. :-(
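For reference, the /proc lookup being criticized could be sketched in C as below. `uid_of_pid()` is a hypothetical helper written for this comment; it assumes that the owner of the /proc/<pid> directory reflects the process's effective user ID, which is normally the case but not guaranteed for every process.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: find the owner of /proc/<pid>.
 * Returns 0 and stores the uid, or -1 if the lookup fails, for example
 * because the process has already exited. Nothing prevents the PID from
 * being reused between the event and this call, so the answer may
 * describe a different process than the one that caused the event.
 */
int uid_of_pid(pid_t pid, uid_t *uid)
{
    char path[32];
    struct stat st;

    assert(uid != NULL);
    snprintf(path, sizeof(path), "/proc/%ld", (long)pid);
    if (stat(path, &st) != 0)
        return -1;
    *uid = st.st_uid;
    return 0;
}
```

Every event handled this way costs an extra path lookup and system call, on top of the window in which the process can disappear.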

My $0.02.

Change notifications for network filesystems

Posted May 29, 2022 21:58 UTC (Sun) by mtodorov (guest, #158788) [Link]

P.S.

Please pardon my typo and imprecision, this should say:

"It doesn't seem to have to break anything, since programs rely on event_len rather than sizeof (struct fanotify_event_metadata) to get data."

Change notifications for network filesystems

Posted May 30, 2022 9:15 UTC (Mon) by taladar (subscriber, #68407) [Link] (1 responses)

Wouldn't that be difficult for network filesystems in particular? You don't really have uids that are the same across the whole network filesystem scope (server + all clients).

Change notifications for network filesystems

Posted May 30, 2022 11:07 UTC (Mon) by mtodorov (guest, #158788) [Link]

IMHO, from the security point of view, it would be very useful, e.g., to know which user is trying to modify /bin/bash on a local filesystem. If this is a user named jdoe@localhost, and he is not one of the admins, then Houston, we have a problem!

IMNSHO, the network filesystem's integrity should be the responsibility of the NFS, SMB or other server ...

A process could request (for example) IN_EVENT_UID in the list of events listened to, and the fs driver could reply with EINVAL or perhaps even more distinctive EREMOTE (Object is remote).

Change notifications for network filesystems

Posted Dec 9, 2023 14:25 UTC (Sat) by bonassis (guest, #168417) [Link]

Hi.

Good to see there is attention for fsnotify methods (inotify, fanotify) and network filesystems. Earlier I spent quite some time on why this does not work on Linux, and wrote this about it (FUSE only):

https://github.com/libfuse/libfuse/wiki/Fsnotify-and-FUSE

One quote about why it does not work: "In short: the individual filesystems do not "know" a watch has been set, and thus cannot react on that."
I wrote some patches back then, which made the fsnotify subsystem in Linux inform the FUSE kernel module when a watch has been set (or changed or removed). That worked, but later I became convinced that this is far better handled in userspace than in the kernel.

In network filesystems you are working in a network environment, right? And when you want fsnotify to work in a network environment, it's because you want applications like a file manager, but also office suites for example, to be informed about changes made by others, on other hosts (because changes made on the same host you are working on do work ...).
Now I think that users are not only interested in seeing a simple event, like a file being created in a watched folder, but also by whom (a user in username@domain.org notation) and from what host. You can do a lot in the kernel, but you can never let a network filesystem in the kernel pass this information through. A better way is to do this in userspace.

Some time ago there was a special service for that (FAM = File Alteration Monitor) but that is not used anymore.

The way it should work in my opinion:
- there is a dedicated service which offers a filesystem change-notification service to applications
- applications can ask (via a mask) what info they want: apart from pretty standard events like a file being added, removed or changed/modified, also by whom, from what host and at what time.
- this service checks the filesystem the watch has been set upon: if it is not a network filesystem and not a FUSE filesystem, it uses the native method, which is fanotify at this moment for Linux. This method provides ways to determine who (via pid) caused the event.
- when dealing with a FUSE filesystem, the daemon is running in userspace, so it should not be so hard to forward the watch request to this daemon. This daemon can then (if it supports this, otherwise fall back to the default, which is fanotify) reply what info it can handle, and possibly provide the information when an event on the backend occurs.

I'm working on a set of software (OSNS, https://github.com/stefbon/OSNS) based upon the SSH protocol, MDNS and FUSE, and there this is doable.
With a network filesystem in the kernel (cifs) this is harder.

I'm very interested in what you think,

best regards,

S. Bon
the Netherlands


Copyright © 2022, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds