Extended attributes
Posted Jan 3, 2019 22:13 UTC (Thu) by TheGopher (subscriber, #59256)
In reply to: Extended attributes by corbet
Parent article: A setback for fs-verity
Posted Jan 3, 2019 22:35 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Posted Jan 3, 2019 22:38 UTC (Thu)
by foom (subscriber, #14868)
[Link] (4 responses)
But making xattrs support large data would effectively also require creating a brand new mechanism. It's not that simple. As the tip of the iceberg, "getxattr" and "setxattr" can only deal with the entire value at once, which is not a good idea for a large data stream.
However, other OSes do support this sort of thing, allowing "forks" of the file to be opened for reading/writing just as a normal file. E.g., Windows NTFS has "alternate data streams", and Solaris has "fsattr". (https://docs.oracle.com/cd/E19253-01/816-5175/6mbba7f02/)
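To make the whole-value limitation of "getxattr" and "setxattr" concrete, here is a minimal Python sketch using the standard os.setxattr/os.getxattr wrappers. The attribute name "user.demo" and the helper xattr_roundtrip are made up for illustration, and the sketch assumes a Linux filesystem with user xattrs enabled (it returns None otherwise).

```python
import os
import tempfile


def xattr_roundtrip(path, name, value):
    """Set an xattr and read it back; return None if xattrs are unavailable."""
    try:
        # setxattr(2) replaces the whole value in one call; there is no offset.
        os.setxattr(path, name, value)
        # getxattr(2) likewise hands back the whole value in one buffer.
        return os.getxattr(path, name)
    except (OSError, AttributeError):
        # Filesystem without user xattrs, or a platform lacking os.setxattr.
        return None


with tempfile.NamedTemporaryFile() as f:
    first = xattr_roundtrip(f.name, "user.demo", b"hello")
    # Changing a single byte still means resending the entire value.
    second = xattr_roundtrip(f.name, "user.demo", b"jello")
    print(first, second)
```

There is no equivalent of lseek() or pwrite() here, which is the part that would hurt for something the size of a Merkle tree.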
Posted Jan 4, 2019 8:43 UTC (Fri)
by epa (subscriber, #39769)
[Link] (3 responses)
Posted Jan 4, 2019 9:01 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link] (2 responses)
Posted Jan 5, 2019 21:22 UTC (Sat)
by epa (subscriber, #39769)
[Link] (1 responses)
Posted Jan 7, 2019 2:40 UTC (Mon)
by marcH (subscriber, #57642)
[Link]
Sounds good.
> and then arrange to hide these extra ones from user space.
Posted Jan 4, 2019 12:20 UTC (Fri)
by quotemstr (subscriber, #45331)
[Link] (21 responses)
Posted Jan 4, 2019 14:20 UTC (Fri)
by dskoll (subscriber, #1630)
[Link] (4 responses)
I like this idea, but on the other hand, you now use up two file descriptors each time you open a file, and something has to manage the hidden descriptor. You could also run into weird issues if the second open fails, but then I guess you'd just fail everything.
Posted Jan 4, 2019 19:02 UTC (Fri)
by quotemstr (subscriber, #45331)
[Link] (3 responses)
Posted Jan 4, 2019 20:26 UTC (Fri)
by dskoll (subscriber, #1630)
[Link] (2 responses)
You could generalize this idea to allow multiple data forks. The fs-verity Merkle tree would be a special data fork that could be set once only and then never changed. I think this is a much nicer approach than shoving the verification data at the end of the file.
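As a rough illustration of the "set once, then never changed" fork idea, here is a Python sketch that computes a Merkle root over fixed-size blocks and stores it in a write-once container. Everything here is hypothetical: the names merkle_root and WriteOnceFork, the 4096-byte block size, and the binary tree shape are assumptions for the example. The real fs-verity tree packs many hashes into each tree block, so this is not its on-disk format.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


def merkle_root(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Return the root hash of a binary Merkle tree over fixed-size blocks."""
    # Leaf level: one SHA-256 digest per data block (at least one block).
    blocks = [data[i:i + block_size]
              for i in range(0, len(data), block_size)] or [b""]
    level = [hashlib.sha256(b).digest() for b in blocks]
    # Hash pairs of digests until a single root remains.
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last digest on odd levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


class WriteOnceFork:
    """A fork that accepts exactly one write (hypothetical semantics)."""

    def __init__(self):
        self._value = None

    def set(self, value: bytes) -> None:
        if self._value is not None:
            raise PermissionError("fork is already set and is immutable")
        self._value = value

    def get(self) -> bytes:
        return self._value


fork = WriteOnceFork()
fork.set(merkle_root(b"file contents" * 1000))
print(fork.get().hex())
```

A second fork.set() call raises PermissionError, which is the property the special fork would need the kernel to enforce.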
Posted Jan 4, 2019 20:34 UTC (Fri)
by quotemstr (subscriber, #45331)
[Link] (1 responses)
Posted Jan 4, 2019 20:40 UTC (Fri)
by dskoll (subscriber, #1630)
[Link]
Haha. :) I'm not a kernel developer and have only recently started a job that requires looking deeply into the kernel, so I'm a newbie at this.
Posted Jan 5, 2019 1:44 UTC (Sat)
by himi (subscriber, #340)
[Link] (13 responses)
Posted Jan 5, 2019 7:46 UTC (Sat)
by alonz (subscriber, #815)
[Link] (8 responses)
Posted Jan 5, 2019 8:08 UTC (Sat)
by bof (subscriber, #110741)
[Link] (7 responses)
Keeping the extra streams could be as easy as putting them in a hidden directory of the same filesystem, e.g. ..forx at the root (each stream named after the original inode plus some discriminator for multiple streams of a given file). That should be transparent to any existing fsck.
Instead of xattrs, these /..forx/inum things could even just be directories by themselves, with each (named or whatever) fork getting a regular inode inside. That would even allow for nested forks, or for a specific fork to be a symlink or device node, or to have its own ownership, permissions, times, and so on.
A special mount option to ignore the magic would make the filesystem wholly copyable/clonable using the usual tools, too.
However, whatever the exact approach, there's the little issue of stuff like "du" reporting underestimated sizes, and the bigger issue of teaching any kind of "cp"-like command to cope with the forks (including changes to archiver file formats). And all of that would only work out for local filesystems, not NFS and friends.
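The /..forx/inum layout described above can be mocked up in userspace to see how it behaves. This Python sketch is only an emulation under stated assumptions: the helper names (fork_path, write_fork, read_fork, total_size) are invented, and a real implementation would live in the filesystem and hide the directory.

```python
import os
import tempfile

FORK_DIR = "..forx"  # hidden directory name taken from the comment above


def fork_path(root, path, fork_name):
    """Map (file, fork name) to <root>/..forx/<inode number>/<fork name>."""
    inum = os.stat(path).st_ino
    return os.path.join(root, FORK_DIR, str(inum), fork_name)


def write_fork(root, path, fork_name, data):
    target = fork_path(root, path, fork_name)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "wb") as f:
        f.write(data)


def read_fork(root, path, fork_name):
    with open(fork_path(root, path, fork_name), "rb") as f:
        return f.read()


def total_size(root, path):
    """Main data plus all forks; a per-file size report would miss the forks."""
    size = os.stat(path).st_size
    fork_dir = os.path.dirname(fork_path(root, path, "x"))
    if os.path.isdir(fork_dir):
        size += sum(os.path.getsize(os.path.join(fork_dir, name))
                    for name in os.listdir(fork_dir))
    return size


root = tempfile.mkdtemp()
main = os.path.join(root, "file")
with open(main, "wb") as f:
    f.write(b"12345")
write_fork(root, main, "merkle", b"abc")
print(os.stat(main).st_size, total_size(root, main))
```

Because the forks are keyed by inode number, they follow the file across a rename on the same filesystem, but a plain cp of the file leaves them behind, which is the copy-tool problem raised above.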
Posted Jan 5, 2019 9:18 UTC (Sat)
by Cyberax (✭ supporter ✭, #52523)
[Link] (6 responses)
Posted Jan 5, 2019 22:21 UTC (Sat)
by neilbrown (subscriber, #359)
[Link] (5 responses)
They all hit the cold hard wall of practicality. You cannot create semantics that do what you want.
If you want a file to start acting like a directory, it has to stop acting like a file. One way to achieve this is to "-o loop" mount it somewhere else with an appropriate filesystem driver. You could come up with other approaches, such as an overlay filesystem which presents select files as directories.
Posted Jan 7, 2019 2:48 UTC (Mon)
by marcH (subscriber, #57642)
[Link] (4 responses)
Mac OS X plays tricks like this but I guess they're just directories from its filesystem perspective, all magic in userspace?
https://www.oreilly.com/library/view/mac-os-x/0596004605/...
Posted Jan 7, 2019 19:32 UTC (Mon)
by jccleaver (guest, #127418)
[Link] (3 responses)
HFS (and HFS+, and MFS) all have a native understanding of dual-forked files. It's been a part of the Mac OS world ever since the original 128k Mac in 1984. The problem is that the rest of the computing world had mostly adopted the "a file is a data stream and that's it" model of unix. This is what led to years of problems transferring Mac files around via FTP on the internet and the creation and adoption of formats like MacBinary and BinHex. For MIME purposes, AppleSingle and AppleDouble were used. From RFC 1740 (https://www.ietf.org/rfc/rfc1740.txt):
> Files on the Macintosh consists of two parts, called forks:
It was not uncommon for some files to consist entirely of a resource fork, with an empty data fork.
In fact, before the PowerPC era and the increasing complexity of shared libraries and apps generally, it was not uncommon for application programs in the System 6 era to consist entirely of a single file, which could be located wherever you wanted. (Imagine if, on the *nix side, all your gettext files and support directories and other miscellaneous crap even console programs might have strewn about the file system were all in a single file on the system.)
Eventually, things got more and more complicated (especially once shared libraries became A Thing on the Mac). Office 6 was a notable monstrosity of filesystem complexity, which made Office 98 a thing of beauty -- you could simply drag the entire Office folder over and that was it.
Note: All of this below is spelled out in Apple technote: https://developer.apple.com/library/archive/technotes/tn/...
By the Mac OS 9 timeframe, a solution was wanted. What ended up being used was a "bundle" bit being set in the fs metadata on certain folders to make them look like files. This hid most of the unnecessary complexity from the user and let them drag things around and manipulate the application as a single entity. If the app didn't need to install something into the System Folder, this meant you could once again think of the app as a single item and treat it accordingly.
It should be pointed out, though, that resource forks were still being used here. The bundle-as-single-icon was more for associating multiple *files*, all of which could have both resource and data forks (and other named forks, but this was rarely used). That said, PPC code was located in the data fork now instead of the resource fork, so it was loaded as more or less an unstructured blob rather than structured CODE blocks handled by the Resource Manager.
When Mac OS became Mac OS X, the de-emphasis of resource forks continued. UFS didn't know what to do with them, more and more file transfer on the internet was happening, which meant lots of inefficient BinHex encoding, and command line tools had barely any concept of the closest parallel: Windows NTFS "streams", which were barely used as well. The solution eventually developed was to treat Resource Forks on filesystems that didn't support them the same way as had been done on FS.
The NeXT-ization of Apple had many benefits, but IMO this was a step backwards. Rich metadata like resources provided something few other OSes had, but made things actually *simpler* on disk as long as the tools and apps knew what to expect. Sadly, Apple didn't even patch the low-level BSD cp/mv commands to understand multiple-fork files until 10.4, so it's clear this was the direction things were going to go. Thus we end up in a world where Bundles and Packages exist and native multiple-fork files are rare. See https://developer.apple.com/library/archive/documentation... for details on how it looks from an OS X perspective.
> Mac OS X plays tricks like this but I guess they're just directories from its filesystem perspective, all magic in userspace?
TL;DR: Folders that appear as a single file, but can be drilled down into in some places (including the Finder).
Hope that helped.
Posted Jan 8, 2019 3:13 UTC (Tue)
by ghane (guest, #1805)
[Link] (1 responses)
So AppImage as done by Apple many years earlier? :-)
--
Posted Jan 8, 2019 8:38 UTC (Tue)
by jccleaver (guest, #127418)
[Link]
Well, more or less. Apple obviously didn't have the true chaos of distros and *nix variants to deal with, just the (brilliant) 68k->PPC transition and the (very well done) PPC->Intel transition, so packaging was never *truly* horrible on the Mac compared to pretty much any other system (especially Windows). OS X application bundles handled multiple architectures with universal binaries almost as a trivial after-thought compared to all the other stuff that was now existing as a separate (hidden) file.
Having complex (e.g., multi-stream, "Resource Manager"-accessed) files but far fewer of them made for a much more grokkable operating system than what others have had to deal with. The classic Mac system software had no command line, and while graphical linux environments try to get by with that, *nix systems are still dealing with thousands or tens of thousands of files on a fresh install. That's a lot of complexity to try to paper back over. Even at the worst "7.5.3 Update 2" era Mac OS complexity, you were still only dealing with a couple of hundred files on a new install, max. And if it weren't for System Enablers (basically hardware support files for each released Apple computer family model) it would have been far fewer.
The Mac OS -> OS X transition was rued by many a classic Mac fan for the interface changes, but more fundamental was the knowledge that we were going from a fundamentally *simpler* system to a complex one that would have to sort of simulate a simple one. Flatpaks, AppImage, and containers generally are all ways to try to get back to that sort of mental simplicity (at the expense of system-management issues such as duplicated libraries or fully static binaries). But there's a place for all kinds of paradigms out there, and smashing them together because Devs can't grok *nix can't really ever achieve the best of either world: the fine-tuning of professional system administration of a complex system, or the ease-of-use and only-a-certain-number-of-things-that-could-be-going-wrongness of a system with fewer parts.
Posted Jan 17, 2019 19:21 UTC (Thu)
by kevinkrouse (guest, #86616)
[Link]
Posted Jan 8, 2019 15:41 UTC (Tue)
by mina86 (guest, #68442)
[Link] (3 responses)
Posted Jan 8, 2019 20:55 UTC (Tue)
by nybble41 (subscriber, #55106)
[Link] (2 responses)
I'm envisioning something more like openxattrat(). The resulting FD could then be passed as dirfd to openxattrat() (with an empty path) or to flistxattr()/fgetxattr()/fsetxattr() to access the xattrs of the resulting inode, recursively.
Posted Jan 9, 2019 1:26 UTC (Wed)
by foom (subscriber, #14868)
[Link] (1 responses)
Posted Jan 10, 2019 4:05 UTC (Thu)
by nybble41 (subscriber, #55106)
[Link]
Posted Jan 8, 2019 15:45 UTC (Tue)
by mina86 (guest, #68442)
[Link] (1 responses)
Posted Jan 9, 2019 17:16 UTC (Wed)
by quotemstr (subscriber, #45331)
[Link]
Why? Aren't streams typed in some way?
It can't be 100% transparent: you need the various integrity-validation mechanisms (fsck and its online brethren) to be aware, so they won't consider these inodes to be orphaned.
There were several attempts to make "file-as-directory" thingies (I remember one in Reiser4). Whatever happened to them?
Resource Forks, Bundles, etc.
>
> Data fork: The actual data included in the file. The Data
> fork is typically the only meaningful part of a
> Macintosh file on a non-Macintosh computer system.
> For example, if a Macintosh user wants to send a
> file of data to a user on an IBM-PC, she would only
> send the Data fork.
>
> Resource fork: Contains a collection of arbitrary attribute/value
> pairs, including program segments, icon bitmaps,
> and parametric values.
>
> Additional information regarding Macintosh files is stored by the
> Finder in a hidden file, called the "Desktop Database".
>
> Because of the complications in storing different parts of a
> Macintosh file in a non-Macintosh filesystem that only handles
> consecutive data in one part, it is common to convert the Macintosh
> file into some other format before transferring it over the network.
>
> The two styles of use are [APPL90]:
>
> AppleSingle: Apple's standard format for encoding Macintosh files
> as one byte stream.
> AppleDouble: Similar to AppleSingle except that the Data fork is
> separated from the Macintosh-specific parts by the
> AppleDouble encoding.
>
> AppleDouble is the preferred format for a Macintosh file that is to
> be included in an Internet mail message, because it provides
> recipients with Macintosh computers the entire document, including
> Icons and other Macintosh specific information, while other users
> easily can extract the Data fork (the actual data) as it is separated
> from the AppleDouble encoding.
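To show the difference between the two encodings in the RFC text above, here is a small Python sketch that packs a data fork and a resource fork the AppleSingle way (one byte stream) and the AppleDouble way (data fork left as a plain file, the rest in a separate header file). The magic numbers, version, and entry IDs are written from memory of RFC 1740 and should be checked against it. The function names are made up, and real files usually carry more entries (Finder info, real name, dates) than this sketch does.

```python
import struct

APPLESINGLE_MAGIC = 0x00051600  # assumed from RFC 1740; verify before relying on it
APPLEDOUBLE_MAGIC = 0x00051607  # assumed from RFC 1740; verify before relying on it
VERSION = 0x00020000
DATA_FORK, RESOURCE_FORK = 1, 2  # entry IDs

HEADER_LEN = 26  # magic (4) + version (4) + filler (16) + entry count (2)
ENTRY_LEN = 12   # entry ID (4) + offset (4) + length (4)


def pack_entries(magic, entries):
    """Serialize {entry_id: bytes} as header, entry descriptors, then payloads."""
    out = struct.pack(">II16xH", magic, VERSION, len(entries))
    offset = HEADER_LEN + ENTRY_LEN * len(entries)
    for entry_id, payload in entries.items():
        out += struct.pack(">III", entry_id, offset, len(payload))
        offset += len(payload)
    return out + b"".join(entries.values())


def unpack_entries(blob):
    """Inverse of pack_entries: return (magic, {entry_id: bytes})."""
    magic, _version, count = struct.unpack_from(">II16xH", blob, 0)
    entries = {}
    for i in range(count):
        entry_id, offset, length = struct.unpack_from(
            ">III", blob, HEADER_LEN + ENTRY_LEN * i)
        entries[entry_id] = blob[offset:offset + length]
    return magic, entries


def apple_single(data, resource):
    """Both forks in one byte stream."""
    return pack_entries(APPLESINGLE_MAGIC,
                        {DATA_FORK: data, RESOURCE_FORK: resource})


def apple_double(data, resource):
    """Return (data file, header file); the data fork stays a plain file."""
    return data, pack_entries(APPLEDOUBLE_MAGIC, {RESOURCE_FORK: resource})


plain, header = apple_double(b"text", b"icon")
print(plain, len(header))
```

With AppleDouble a non-Mac recipient can use the first element directly and ignore the header file, which is the reason the RFC prefers it for mail.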
Sanjeev
If I understand you correctly, you're suggesting an iopen(int inode, int flags, mode_t mode) syscall. If that's the case, the problem is that it would allow bypassing filesystem permissions. Namely, it would render the execute bit of a directory useless, since a user would be able to read a world-readable file even if it resides in a directory they have no access to.
openxattrat(int dirfd, const char *path, const char *name, int flags, mode_t mode)
The link to the internal xattr inode would be hidden in the filesystem, and you would need at least search access to the file to open the linked xattr inode. User-mode software would never handle the raw inode numbers.
Changes to unlink(2) would break rm * though. E.g. if I run rm file file.xattr, I'll get an error deleting the second file since it was already deleted transparently.