Garrett: ext4, application expectations and power management

Posted Mar 15, 2009 16:04 UTC (Sun) by smoogen (subscriber, #97)
Parent article: Garrett: ext4, application expectations and power management

I am confused... when did fsync() stop being a standard operation when writing to disk, even when writing tons of little files? That was pretty much drummed into us in the 1980s and early 1990s: write(), fsync(), write(), fsync(). And always fsync() before a close().
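
For readers who have not seen that discipline spelled out, here is a minimal sketch of it in Python, whose `os` module wraps the same system calls. The function name `write_with_fsync` and the chunked interface are assumptions made for illustration; nothing here comes from a particular application.

```python
import os

def write_with_fsync(path, chunks):
    """Write each chunk, fsync() after it, and fsync() once more before close()."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for chunk in chunks:
            view = memoryview(chunk)
            while view:                      # os.write() may write fewer bytes than asked
                written = os.write(fd, view)
                view = view[written:]
            os.fsync(fd)                     # wait until the kernel has flushed this chunk
        os.fsync(fd)                         # the "always fsync() before a close()" rule
    finally:
        os.close(fd)
```

Every `os.fsync()` call blocks until the kernel reports the data as written, which is where the cost discussed in the replies comes from.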


Garrett: ext4, application expectations and power management

Posted Mar 15, 2009 16:27 UTC (Sun) by drag (guest, #31333) [Link] (29 responses)

The trick here is that you do NOT want your application to write data to disk all the time.

When you go:
1. Create file 1
2. Write file 1
3. (do processing for some amount of time)
4. Create file 2
5. Write file 2
6. Rename file 2 over file 1

This is the application telling the OS that it wants either good old data or good new data. That is, the application developer is trying to ensure that a crash does not lose all of the data.
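
Steps 4 to 6 can be sketched as follows, assuming Python and a hypothetical helper name `replace_via_rename`; the `.tmp` suffix is likewise an assumption. Note that no fsync() is issued, so what survives a system crash depends on the filesystem, which is the subject of this whole thread.

```python
import os

def replace_via_rename(path, data):
    """Write data to a temporary sibling file, then rename it over path.

    No fsync() is issued: this protects against the application dying
    mid-write, but crash behaviour is left to the filesystem.
    """
    tmp_path = path + ".tmp"          # hypothetical naming scheme
    with open(tmp_path, "wb") as tmp:
        tmp.write(data)               # steps 4-5: create and write "file 2"
    os.replace(tmp_path, path)        # step 6: rename file 2 over file 1
```

A reader of `path` sees either the complete old contents or the complete new contents while the system is running, because the rename swaps the name in one step.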

Also this sort of technique is used for lots of other reasons. You can run into situations where multiple applications want to write the same file, and this sort of technique helps make sure that they don't corrupt each other's data. It is also useful for applications that may crash themselves. If they segfault or otherwise go apeshit while writing out a new file, this allows them to fail ungracefully without corrupting the old file.

So it's not really much to do with how the file system works on the block-to-filesystem layer of things; it's meant to deal with how the OS works on the filesystem-to-application layer of things. The applications should be unaware of, and not really care a whole lot about, the lower levels as long as they don't exhibit any pathologically bad behavior.

So remember the goal is "either good new data or good old data".

When my desktop is writing out all those .gconf files I don't really care about that data. If I crash my system I expect to lose those preferences. That's normal, but I don't want to end up with a desktop that won't work at all due to a bunch of corrupt "registry" files.

If you use fsync() on all of that, then every time I make a change to the UI or some settings my disk is going to spin up. Using fsync() tells the OS "I only want good new data", which is a much more stringent requirement and thus has a much heavier impact on the system as a whole.

The only time I, as a user, expect and want that behavior is when writing out important files or doing important system tasks, like writing out a file with OO.org or editing /etc files with vim or whatnot.

When that happens I don't care so much about the disk spinning up, or that I am using an entire block of my flash drive to store a 1k file, because that small file is important...

-------------------

It took me a while to understand this. Fsync is just too big a hammer for what many applications need or want.

Garrett: ext4, application expectations and power management

Posted Mar 15, 2009 16:33 UTC (Sun) by smoogen (subscriber, #97) [Link] (27 responses)

Ah ok. But renaming files should always have an fsync in it... shouldn't it? I mean, that's a point where you WANT the new data, not the old data.

On the other hand, reading the notes in his blog I can see why people have complained about xfs 'corruptions' with some of their applications that they didn't see with ext3. I wonder what reiserfs and jfs did with this sort of data write. [I think I know what ntfs does... it's in the same hole as xfs.]

Garrett: ext4, application expectations and power management

Posted Mar 15, 2009 16:49 UTC (Sun) by drag (guest, #31333) [Link]

> Ah ok. But renaming files should always have an fsync in it... shouldn't it? I mean, that's a point where you WANT the new data, not the old data.

Well, it depends. If the data you're dealing with is not that important, then losing the latest changes isn't going to be the end of the world. Who cares really if Epiphany missed that last bit of history in the last 60 seconds or so? Who cares that I just set the default font to Arial in the last 30 seconds? Just as long as my preferences and history are not wiped out completely, or corrupted to the point where nothing will run without me going in and cleaning up the bad files.

It's a trade-off.

If the data is more important than good battery life, good disk performance, well laid out data on the block device, or the long life of your flash drive (etc.), then running fsync() all the time is what you want.

Some data is that important, other data not so much.

If all data were equally critical, then it would be better to run in "sync" mode all the time and just take the hit.

-------------

And remember that on a Linux system with stable drivers and decent hardware, running fsync() before a rename gains you almost nothing.

Garrett: ext4, application expectations and power management

Posted Mar 15, 2009 19:00 UTC (Sun) by kasperd (guest, #11842) [Link] (25 responses)

> But renaming files should always have an fsync in it... shouldn't it? I mean, that's a point where you WANT the new data, not the old data.

No, even at that point it might not be important to get the new data to disk right away. But it is still important to get either the old or the new data. However, ext4 can leave the disk contents in a state that didn't exist at any point during the changes the process made, because ext4 did the rename before writing the data.

The problem is that there exists no API that guarantees exactly the level of integrity needed in many cases. You used to be able to create a file and then rename it on top of an existing file to get what you wanted. The change forced you to sync to get the guarantee that you needed, but it gave you more than you wanted and was slower because of that.

So there are three possibilities:
  1. You want to get the new data on disk as quickly as possible and wait for it to get there.
  2. You want some good data on disk either the old or the new, but you don't care which you get in case the system crashes in the next minute or so, as long as it doesn't take too long for the data to make it to disk.
  3. You don't care about the data at all, it is perfectly fine for it to get lost or corrupted.
I'd say 3 is unlikely to be desired by many applications; why would you be writing the file in the first place if you didn't care about the data? But right now there is no API to give you 2; the one that used to give you that now gives you 3.

Garrett: ext4, application expectations and power management

Posted Mar 15, 2009 20:19 UTC (Sun) by drag (guest, #31333) [Link] (10 responses)

I was thinking a bit more about it.

Basically people want 3 levels of data integrity in applications (paraphrasing what you and other people are saying):

1. High priority: Write data _now_. All data is safe in case of system failure.

2. Normal priority: Ensure no corruption of existing data in case of system failure.

3. Low priority: temporary data that will get used for a session. No requirements for preserving data in case of system failure.

Ext4 (as it existed) can only provide 1 or 3, but not 2.
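
The three levels can be sketched from the application's side, assuming Python. The `Durability` enum and the `save` function are hypothetical names invented for this illustration, not an existing API; the NORMAL branch in particular depends on the filesystem ordering data ahead of the rename, which is the guarantee under dispute.

```python
import enum
import os

class Durability(enum.Enum):
    HIGH = 1    # data must be on disk before save() returns
    NORMAL = 2  # old or new contents after a crash, never garbage
    LOW = 3     # session data, nothing promised

def save(path, data, level):
    if level is Durability.LOW:
        with open(path, "wb") as f:      # overwrite in place, no safety net
            f.write(data)
        return
    tmp_path = path + ".tmp"             # hypothetical temporary name
    with open(tmp_path, "wb") as f:
        f.write(data)
        if level is Durability.HIGH:
            f.flush()                    # push Python's buffer to the kernel
            os.fsync(f.fileno())         # then ask the kernel to reach the disk
    # NORMAL relies on the filesystem writing the data before the rename;
    # there is no portable call that asks for only that ordering.
    os.replace(tmp_path, path)
```

The only difference between HIGH and NORMAL is the flush and fsync() pair, which is why applications that want level 2 end up paying for level 1.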

Garrett: ext4, application expectations and power management

Posted Mar 15, 2009 20:32 UTC (Sun) by smoogen (subscriber, #97) [Link] (3 responses)

Of the file systems, is ext3 the only one that gives that promise (and only by accident, as it was an unintended consequence)?

xfs would seem not to, and btrfs neither (going by the original blog post). I don't know about all types of reiserfs or jfs.

I am not saying the 'promise' is not important.. but it might be one that file-system developers should be aware that people want versus what they think people should expect :)

Garrett: ext4, application expectations and power management

Posted Mar 15, 2009 23:41 UTC (Sun) by drag (guest, #31333) [Link] (2 responses)

Ya. It seems to me that Ext3 only works that way by accident.

But it seems that for consumer devices this sort of behavior could actually be a fundamental design improvement over the way file systems have traditionally worked, and could be advertised as an actual selling point (that is, being able to deliver promise #2 reliably).

Garrett: ext4, application expectations and power management

Posted Mar 16, 2009 16:12 UTC (Mon) by jspaleta (subscriber, #50639) [Link]

I've always wondered.. how many of the more important or more impactful improvements in technology in the long view of history were simply uncharacteristically happy accidents versus premeditated "design" decisions.

-jef

Garrett: ext4, application expectations and power management

Posted Mar 19, 2009 23:28 UTC (Thu) by jzbiciak (guest, #5246) [Link]

Not really by accident. I believe the necessary dependence is established by the "data=ordered" mount option. That's pretty much what we need to fix this issue: Make sure that the data is on the disk before you write the updated metadata.

That doesn't mean you need to flush things to the disk early. It just means that things have to happen in a particular order.

The three levels of write priority

Posted Mar 16, 2009 7:41 UTC (Mon) by rvfh (guest, #31018) [Link]

I like the three levels of commit priority you set, and I would rather this was an open() option than an application decision to call fsync() (and when, in case 2?)

1. O_COMMITQUICK commit to disk every 5 seconds
2. O_COMMITNORMAL commit to disk every 30 seconds
3. O_COMMITLAZY commit to disk only if need be, or maybe after 300 seconds
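
These flags do not exist, so the closest an application can get today is to emulate them in user space. The sketch below assumes Python; `TimedCommitFile` and the three constants are hypothetical names, and unlike the proposal it can only commit when a write happens to arrive after the interval has passed.

```python
import os
import time

# Hypothetical policies named after the proposed flags; values are seconds.
COMMIT_QUICK = 5
COMMIT_NORMAL = 30
COMMIT_LAZY = 300

class TimedCommitFile:
    """Append-only file that fsync()s at most once per `interval` seconds."""

    def __init__(self, path, interval, clock=time.monotonic):
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self._interval = interval
        self._clock = clock
        self._last_commit = clock()
        self.commits = 0                 # counts fsync() calls, for illustration

    def write(self, data):
        os.write(self._fd, data)         # short writes are ignored for brevity
        if self._clock() - self._last_commit >= self._interval:
            self.commit()

    def commit(self):
        os.fsync(self._fd)
        self._last_commit = self._clock()
        self.commits += 1

    def close(self):
        self.commit()                    # final flush before giving up the descriptor
        os.close(self._fd)
```

The `clock` parameter exists only so the timing can be driven by hand when experimenting; a real open() flag would let the kernel do this bookkeeping for every file at once.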

Just my 0.02€

Garrett: ext4, application expectations and power management

Posted Mar 16, 2009 9:58 UTC (Mon) by mjthayer (guest, #39183) [Link] (3 responses)

I have asked this a couple of times but not yet got a good answer. I presume that the kernel knows what has been written back and what not. Can't it optionally keep its own log - either in a file on the filesystem or in pre-allocated blocks on a swap device - where it writes details of any transaction which the target filesystem won't write back within a certain maximum timeframe. When the filesystem does do the writeback the transaction can be purged from the log. This could be enabled or disabled for the entire system, regardless of what filesystems are in use, and would not require Ted to add code he doesn't like.

Garrett: ext4, application expectations and power management

Posted Mar 16, 2009 10:52 UTC (Mon) by MathFox (guest, #6104) [Link] (2 responses)

Michael, yes, the kernel could do it, but such a log would have to be written to disk... But then it would be more efficient to write that log directly to the file system.
You'll create similar issues wrt. performance and commit intervals with a kernel-based log, but with the added overhead of writing data twice.

Garrett: ext4, application expectations and power management

Posted Mar 16, 2009 10:54 UTC (Mon) by mjthayer (guest, #39183) [Link] (1 responses)

Would that apply even if the blocks for the log were reserved in advance and their location known to the kernel?

Garrett: ext4, application expectations and power management

Posted Mar 16, 2009 10:55 UTC (Mon) by mjthayer (guest, #39183) [Link]

I will answer my own question - presumably yes, because the kernel can't assume that the filesystem does a simple block to disk mapping.

Garrett: ext4, application expectations and power management

Posted Mar 18, 2009 17:19 UTC (Wed) by rich0 (guest, #55509) [Link]

I agree with your points. And putting fsyncs all over the place in applications is not very helpful.

My mythtv backend (which does a lot of other stuff as well) used to have lots of problems with ivtv buffer overruns. It turns out that mythtv uses a fairly small cache, and when it writes to disk it does an fsync on every write. That means that the disk write cache is almost constantly getting flushed and the ability of the kernel to re-order writes is compromised, which then causes I/O waits when the system is busy with other stuff as well.

When I increased the buffer moderately and got rid of the fsync everything worked great. So, if I lose power maybe I might lose an extra 10 seconds of the TV show I was recording. However, before the fix I was getting glitches in the video all the time due to overruns.
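
The change described here, a larger buffer and no fsync per write, can be illustrated with a small sketch. This assumes Python and is not MythTV's code; `CoalescingWriter` is a hypothetical name, and the `flushes` counter is only there to make the effect visible.

```python
class CoalescingWriter:
    """Collect small writes in memory and pass them on in larger pieces."""

    def __init__(self, raw, buffer_size):
        self._raw = raw                  # any object with a write(bytes) method
        self._buffer = bytearray()
        self._buffer_size = buffer_size
        self.flushes = 0                 # how many times data was handed to raw

    def write(self, data):
        self._buffer += data
        if len(self._buffer) >= self._buffer_size:
            self.flush()

    def flush(self):
        if self._buffer:
            self._raw.write(bytes(self._buffer))
            self._buffer.clear()
            self.flushes += 1
```

Python's built-in `open(path, "wb", buffering=N)` provides the same kind of buffering; the class above just spells out the trade: fewer, larger writes in exchange for losing up to one buffer of data on a crash.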

The role of the OS should be to allow applications to indicate the sensitivity of data and then the OS should figure out how to balance contention for the disk taking into account this kind of weighting. Applications should not be micro-managing the disk cache - that defeats the ability of the kernel to optimize the cache.

Garrett: ext4, application expectations and power management

Posted Mar 16, 2009 3:14 UTC (Mon) by Nick (guest, #15060) [Link] (13 responses)

> No, even at that point it might not be important to get the new data to disk right away.
> But it is still important to get either the old or the new data. However ext4 leaves the disk
> contents with something that didn't exist at any point during the changes the process made,
> this is because ext4 did the rename before writing the data.

> The problem is, that there exist no API that guarantee exactly the level of integrity needed in
> many cases. You used to be able to create a file and then rename it on top of an existing file to
> get what you wanted. The change forced you to sync to get the guarantee that you needed, but
> it gave you more than you wanted and was slower because of that.

rename is a metadata operation, which is atomic. If you have not guaranteed the data is on disk
with fsync, then seeing the new file with no data after a crash is one obvious outcome.

And if you rely on btrfs to flush on rename, or ext3 semantics or whatever, then the app is still
broken.

"write, fsync, rename" is the sequence you need for correctness. If you don't need the new data
right away, then defer the fsync,rename part until the point at which you do need it. If you see or
perceive some performance problem with the sequence required for correctness, then the answer is
absolutely not to destroy correctness or hope to rely on some undocumented implementation
detail. Raise the issue on lkml, provide details, suggest additional APIs etc.
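
The "write, fsync, rename" sequence looks like this as a minimal sketch, assuming Python; `durable_replace` and the `.tmp` suffix are hypothetical names chosen for illustration.

```python
import os

def durable_replace(path, data):
    """Write data to a temporary file, fsync() it, then rename it over path."""
    tmp_path = path + ".tmp"              # hypothetical temporary name
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:                       # os.write() may be partial
            view = view[os.write(fd, view):]
        os.fsync(fd)                      # the data reaches the disk first
    finally:
        os.close(fd)
    os.replace(tmp_path, path)            # only then the atomic rename
```

Deferring "the fsync,rename part" amounts to calling this function later rather than on every change. Making the rename itself durable would, on Linux, additionally take an fsync() on the containing directory; that step goes beyond what the comment describes and is left out here.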

Garrett: ext4, application expectations and power management

Posted Mar 16, 2009 3:34 UTC (Mon) by drag (guest, #31333) [Link] (12 responses)

What is so wrong with a file system honoring the order of operations?

I mean, if an application does a write then rename, why not wait to commit the rename to disk until after the write is committed?

Nobody is caring if the data is flushed to the drive immediately on a rename; just that the data is on the disk by the time the rename is on the disk. That way if the system crashes then your old copy of the data is still valid.

Garrett: ext4, application expectations and power management

Posted Mar 16, 2009 4:07 UTC (Mon) by k8to (guest, #15413) [Link]

Because write is not a write to disk, and rename is not a rename to disk. They do occur in order in a perceptual way.

That they do not occur in order on disk is what you would want for the usual case.

This is a situation where the apis should be enhanced so that the application can tell the system what it needs.

Garrett: ext4, application expectations and power management

Posted Mar 16, 2009 5:02 UTC (Mon) by dlang (guest, #313) [Link] (2 responses)

because providing the ordering that you want would kill performance. it would mean that you could not reorder I/O from the order that the various programs happened to ask for it to something that the storage system can do more efficiently. it would mean that the storage system would (in most cases) not be able to combine separate I/O operations into a smaller number of them.

and as a result, it would also cause the drives to wear out faster as they seek across the entire drive more.

you may think that you want that sort of guarantee, but you really don't. if you did, then the 5 second window that ext3 has would be completely unacceptable to you as well.

Partial Ordering and Disk I/O

Posted Mar 16, 2009 13:21 UTC (Mon) by Pc5Y9sbv (guest, #41328) [Link]

I wish someone deeply familiar with file system design would give a detailed answer to this question. I am a computer scientist and software architect but don't have practical experience writing or optimizing general purpose file systems. I would, however, love to see pointers to more detailed reading.

But intuitively, I don't think it is as bad as you state. To honor the POSIX ordering all the way to disk would introduce a partial order on write operations, easily imagined as a queue-like structure comprised of a DAG of requests sequenced by write barrier relationships. Each set of siblings and descendents may be reordered, and this need only be maintained in system RAM and mapped to write barriers in the final queued I/O layer to disk. The kernel I/O scheduling would make some of the ordering decisions in mapping the DAG into a stream with write barriers, and leave the rest up to the disk controller. (Examples of mapping the DAG to the stream include deciding how bands of unordered writes from two different streams would be merged into the same band of the final stream, where that band is a set of writes between two write barriers, versus staggered out at different rates to adjust throughput of different streams.)

The sources of this partial ordering information could be explicit syscall/API extensions for write-barriers, but could also be heuristics for cases like that under discussion: maintain ordering with respect to batches of inode-file content writes and inode-linking metadata writes, and related atomic actions like separate relinks of the same file inode or directory inode. This would cover the broad range of "make file content available under a name" crash-recovery semantics and then some...
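
A toy version of that DAG, assuming Python 3.9 or later for the standard `graphlib` module, might look like the sketch below. The request names and the `schedule` function are made up for illustration, and a real I/O scheduler would of course weigh much more than dependency order.

```python
from graphlib import TopologicalSorter

def schedule(dependencies):
    """Return batches of writes; order inside a batch is unconstrained.

    `dependencies` maps each request to the set of requests that must
    be on disk before it.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())   # sorted only for stable output
        batches.append(ready)                # a write barrier follows each batch
        sorter.done(*ready)
    return batches

# Two independent replace-via-rename sequences (hypothetical file names).
requests = {
    "a.tmp data": set(),
    "b.tmp data": set(),
    "rename a.tmp -> a": {"a.tmp data"},
    "rename b.tmp -> b": {"b.tmp data"},
}
```

Calling `schedule(requests)` yields two batches: both data writes, then both renames. Everything within a batch can still be merged and reordered freely, which is the point being argued here.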

Coming from a scientific computing background, I suspect most more complex file writing scenarios, such as shared write access from multiple processes, would already have taken into account more elaborate rollback and recovery strategies for the file content in the case of crashes.

What is so wrong with a file system honoring the order of operations?

Posted Mar 20, 2009 19:06 UTC (Fri) by anton (subscriber, #25547) [Link]

> because providing the ordering that you want would kill performance. it would mean that you could not reorder I/O from the order that the various programs happened to ask for it to something that the storage system can do more efficiently. it would mean that the storage system would (in most cases) not be able to combine separate I/O operations into a smaller number of them.

No (to each of these statements). A file system could combine many operations into one large batch, write out the batch in any order and with as few I/O operations as it (or the drive) likes, then commit the whole batch by writing one commit block. That would be efficient. Of course this means that no old block must be overwritten before the commit block is written, but that can be achieved by using a journal or a copy-on-write file system.
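
The batch-plus-commit-block idea can be modelled in a few lines. This is an in-memory toy, assuming Python, with hypothetical names (`write_batch`, `recover`, `COMMIT`); it is not how any particular journal is laid out on disk.

```python
import random

COMMIT = ("commit",)                 # marker record that seals a batch

def write_batch(journal, updates, crash_after=None):
    """Append a batch of (block, value) records, then one commit record.

    crash_after=N simulates power loss once N records have reached the journal.
    """
    records = list(updates.items())
    random.shuffle(records)          # the batch may reach the disk in any order
    records.append(COMMIT)
    for count, record in enumerate(records):
        if crash_after is not None and count >= crash_after:
            return                   # simulated crash: the rest never lands
        journal.append(record)

def recover(disk, journal):
    """Apply only batches that are followed by a commit record."""
    state = dict(disk)
    pending = {}
    for record in journal:
        if record == COMMIT:
            state.update(pending)
            pending = {}
        else:
            block, value = record
            pending[block] = value
    return state                     # records without a commit are discarded
```

Whatever order the records land in, recovery returns either the state before the batch or the state after it, never a mixture.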

And yes, I want that guarantee, I really do, and I don't care if the file system loses 5 seconds or 30 seconds of operations, in case of a crash, but I do care if what it gives me is a state that never logically existed before the crash.

Garrett: ext4, application expectations and power management

Posted Mar 16, 2009 8:47 UTC (Mon) by Nick (guest, #15060) [Link] (7 responses)

> What is so wrong with a file system honoring the order of operations?

There is nothing wrong with it, what is wrong is an application ignoring the documented
standards and assuming it will "honour" some semantics that they happen to think are
reasonable.

Historically the reason why they don't do this is performance. POSIX as far as I can see encoded
existing semantics in this regard, rather than a case of some particular OS or filesystem
developers making some legal interpretation of the document that goes against the spirit of it.

> I mean if a application does a write then rename, why not wait to commit the rename to
> disk until after the write is committed?

You could, but that's not a trivial thing to do for a lot of filesystems (without resorting to an
fsync), and it would also cost performance for apps that don't want it.

> Nobody is caring if the data is flushed to the drive immediately on a rename; just that the
> data is on the disk by the time the rename is on the disk. That way if the system crashes
> then your old copy of the data is still valid.

The way to do that is with fsync. If some filesystem happens to honour flush on rename, you are
still going to need fsync in order to have a correct and portable app, unfortunately. If you just
want the ordering but not the synchronous write that fsync gives, then you need to propose a
new syscall API for this (which would degenerate to fsync if a particular filesystem can't handle it
nicely).

Garrett: ext4, application expectations and power management

Posted Mar 16, 2009 14:25 UTC (Mon) by jamesh (guest, #1159) [Link] (6 responses)

> There is nothing wrong with it, what is wrong is an application ignoring
> the documented standards and assuming it will "honour" some semantics
> that they happen to think are reasonable.

Which standard is the application not honouring? The POSIX standard leaves behaviour over system crashes undefined so they can't rely on that one.

Absent some other standard to define the behaviour on crash, applications are left to assume that the implementation defined behaviour is sane.

Given the POSIX defined behaviour of rename() when the system isn't crashing and real world behaviour of ext3, zfs, etc, having the filesystem attempt to preserve the atomic "old content or new content" behaviour seems desirable.

Garrett: ext4, application expectations and power management

Posted Mar 16, 2009 14:45 UTC (Mon) by k8to (guest, #15413) [Link] (5 responses)

They are not honouring the requirement for them to express that the data be on the disk when the rename is applied.

That's not wrong. It's just wrong if the application requires that the data be on disk after crash, which is what everyone is bitching about.

In the replace-with-rename pattern, it's wrong.

Garrett: ext4, application expectations and power management

Posted Mar 16, 2009 14:59 UTC (Mon) by jamesh (guest, #1159) [Link]

> They are not honouring the requirement for them to express that the data
> be on the disk when the rename is applied.

Right. There doesn't seem to be a way to do this without requiring that the data be written to disk right now. In these cases, the application is fine with delayed writes -- they just want the ordering of the write and the rename to be preserved.

> That's not wrong. It's just wrong if the application requires that the
> data be on disk after crash, which is what everyone is bitching about.

That isn't what the applications require though. The behaviour they are after is for the rename to be recorded only if the associated writes are also recorded.

It is acceptable if the rename is lost by a system crash. What is not acceptable is for the rename to occur but not the write.

If the application wanted to be sure that the data had been flushed, before the rename, then yes they should call fsync().
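
The difference between the two behaviours can be made concrete by enumerating what a crash may leave behind. The sketch below assumes Python and a deliberately tiny model with two operations; the function names are hypothetical.

```python
from itertools import combinations

OPS = ("write new data", "rename tmp over target")

def crash_states(ordered):
    """Sets of operations that may have reached the disk when a crash hits."""
    if ordered:
        # A later operation is persisted only if all earlier ones are.
        return [set(OPS[:n]) for n in range(len(OPS) + 1)]
    # Without ordering, any subset may have been persisted.
    return [set(c) for n in range(len(OPS) + 1) for c in combinations(OPS, n)]

def target_after_recovery(persisted):
    """What the target file name points at after the crash."""
    if "rename tmp over target" not in persisted:
        return "old"
    return "new" if "write new data" in persisted else "empty"

def outcomes(ordered):
    return {target_after_recovery(s) for s in crash_states(ordered)}
```

With ordering preserved, `outcomes(True)` contains only "old" and "new"; without it, `outcomes(False)` also contains "empty", the zero-length file people are complaining about. Neither case requires the data to be flushed at any particular time.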

Garrett: ext4, application expectations and power management

Posted Mar 16, 2009 15:03 UTC (Mon) by drag (guest, #31333) [Link]

> That's not wrong. It's just wrong if the application requires that the data be on disk after crash, which is what everyone is bitching about.

Well they want either the old data or new data to be in a file system after recovering from a crash. Not files full of zeroes...

People are willing to put up with missing X number of seconds of work from the vast majority of applications they are using.

It's actually rare that people want data immediately written to disk. Stuff they want saved very carefully and immediately is generally going to be user-generated data (what you're editing with Emacs) and not automatically generated data (my application remembering the position of icons in my windows).

Forcing a commit immediately to disk seems to be a much bigger hammer than what is wanted. They just want the OS not to corrupt files if it can be helped.

If fsync() is the only way to keep the OS from randomly blowing away files on my hard drive, then so be it. It just seems like there should be a better way.

Garrett: ext4, application expectations and power management

Posted Mar 17, 2009 10:09 UTC (Tue) by malor (guest, #2973) [Link] (2 responses)

What the author is arguing, and I agree with him, is that applications need a method to guarantee that the data on disk is always good, whatever version it is, but without the penalty of a full fsync. That may not matter _that_ much on a server or desktop, but on a laptop, that means the drive absolutely has to spin up from sleep, or can't sleep in the first place. This is a substantial battery hit. I don't have any easy way to test it, but hard drive spinups are expensive as hell (and slow), so it wouldn't shock me if this ext4 behavior change singlehandedly wiped out a good chunk of the work done to improve kernel power usage on laptops.

Atomic rename is not the same thing as fsync. Telling application authors that they have to use fsync is yet another example of, when something is hard to do in Linux, telling the user that what he or she wants is wrong and stupid. This pattern goes way, way back.

Once upon a time, in the early days of Linux, I commented on Slashdot that ext2 was a bad filesystem, and would lose data if the computer crashed or lost power. I was informed, by numerous people, that the data loss was my fault because the computer wasn't on a UPS, and that I should 'simply' have manually run a disk editor and restored a backup superblock to recover the corrupted files. Seriously: lost data, they claimed, was my fault because I didn't understand the layout of ext2 well enough to fire up a hex editor when it crashed.

Well, sometime in the next year or two, journaling showed up, and suddenly everyone was all about how wonderful it was, how horrible ext2 was in comparison, and how no sane person would use ext2 in production. But when I'd said that, when there was no other option, I was wrong and stupid for wanting reliability in my filesystem.

I see this argument the same way; by accident, the ext3 writers provided a very useful feature. Atomic rename isn't fsync; it's much lighter weight. People are not wrong and stupid for wanting it, but because it's hard, that's practically the first thing out of people's mouths. "You can't do that on ext4. That's not the POSIX semantics, and you're foolish to expect this behavior."

I disagree vehemently. It's a very good feature, and even if it "isn't the Posix standard", you guys should bring this behavior forward. Doing it via the regular rename operation might be a good choice, because it's backwards-compatible with the original accidental feature. Or, perhaps you'll instead want to add an explicit atomic rename operation, so that filesystems like xfs won't surprise users unpleasantly. That would require more pain on the part of application developers, but would make the guarantee explicit instead of implicit, which is probably better from a design perspective.

But telling people to use fsync instead of atomic rename, and that they're wrong and stupid for wanting a feature that's hard to do, is just a tired repetition of a very old game indeed.

Garrett: ext4, application expectations and power management

Posted Mar 17, 2009 15:37 UTC (Tue) by smoogen (subscriber, #97) [Link] (1 responses)

As far as I can tell... the only way you are going to get what you want is an fsync() or a battery-backed cache. Disk drives are limited to writing or reading and are pretty much a 'linear' device in that regard.

In the past, the fsync sort of happened every 5 seconds so you never really spun down your disk. It was the reason why people considered ext3 a slow filesystem compared to xfs, etc etc. One can get better performance, but at the price of reliability.

Garrett: ext4, application expectations and power management

Posted Mar 18, 2009 7:54 UTC (Wed) by malor (guest, #2973) [Link]

It's not the 5-second thing. Rather, something about how ext3 orders writes means that, purely by accident, a rename of a file will always be done after the data blocks of the file have been written to disk. I have no idea why this happens, and it obviously wasn't an intended feature, but that's how it actually works out in practice. The fact that xfs doesn't do this, in fact, is one of the reasons it's considered unreliable by people who've used it on the desktop.

Even if disk spinups were once every five minutes instead of every five seconds, you would still get that behavior; all the data blocks of a given file would be written to disk before that file was renamed over another one.

This means that you're guaranteed to always have either the old data OR the new data. You don't know which you have, after a kernel crash or power failure, but you have one or the other. And this happens without needing to do an fsync, which is a different logical thing, and which absolutely requires a drive spinup. This sync-and-rename functionality is much lighter weight, and can happen pretty much anytime. It doesn't add to the power burden of using the disk, but still guarantees a form of data integrity that many applications find very useful.

Either good old data OR good new data is not the same as fsync. Telling programmers to use fsync is forcing them to use the hammer that's convenient, instead of the screwdriver that would better solve the problem.

Garrett: ext4, application expectations and power management

Posted Mar 17, 2009 15:10 UTC (Tue) by kjp (guest, #39639) [Link]

Wow. This thread is soaking up a huge portion of my work day. I posted this on Ted's blog:

> in order to get very high performance levels, file systems usually combine 5-30 seconds worth of file system operations into a single transaction commit.

Ted, thanks for continuing the dialogue here. It’s been educational. Thanks for putting the rename ‘kludge’ in, but I do think it’s absolutely necessary. I will give you my use case.

I use the write-close-rename all the time with configuration files that I don’t care if I lose the last N seconds of change to. This is easy and thread safe and has worked for years on ext3. At the same time, I have a daemon that is continually writing out large log files. We also run on cheap IDE hardware due to cost pressures. It seems that fsyncs will force your above described ‘mega transaction’ to complete, which involves seeking all over the disk to our other dirty log files. If I made multiple changes to a conf file during a 5-30 second ext4 flush interval, fsync will cause more seeks than not using it, which will wear out our disks.

You are right that the FS is not a database; for certain things I do not care about instant durability. So we can:
1. have the FS support write barriers on rename or other new api
2. do writes to a user space cache daemon that only flushes when necessary
3. make every app much more complicated and cache its own data

#1 already works.

