The NOVA filesystem

By Jake Edge
May 18, 2018
LSFMM

At the 2018 Linux Storage, Filesystem, and Memory-Management Summit, Andiry Xu presented the NOVA filesystem, which he is trying to get into the upstream kernel. Unlike existing kernel filesystems, NOVA exclusively targets non-volatile main memory (NVMM) rather than traditional block devices (disks or SSDs). In fact, it does not use the kernel's block layer at all and instead uses persistent memory mapped directly into the kernel address space.

Xu compared NOVA to versions of the ext4 and XFS filesystems with support for the DAX direct-access mechanism. With those, only the file data bypasses the page cache; the metadata still goes through it. In addition, those filesystems have much higher latency for append operations, and they suffer from a write-amplification effect. All of that makes for high journaling overhead, he said.

Beyond that, there are scalability issues for those filesystems on NVMM. He ran some tests on high-end multicore hardware to compare NOVA and tmpfs to the DAX modes of ext4 and XFS. In his tests, he emulated NVMM with RAM, since it is difficult to actually get NVMM devices at this point. In general, only tmpfs and NOVA scale reasonably; the other filesystems contend for various locks and semaphores. There is still room for NOVA to improve, though, as only tmpfs scaled reasonably for one of the tests.

Support for huge pages is difficult for DAX filesystems, Xu said. Huge pages require that the physical address is aligned on a huge-page-size boundary and that the memory is physically contiguous, but memory allocated by filesystems does not necessarily conform to those requirements. Dave Chinner said that XFS has an inode option to support huge-page use; another attendee said that ext4 has an analogous feature but it can only support 2MB huge pages, not 1GB.
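The two requirements Xu described can be expressed as a small check. This is only an illustrative sketch in Python, not kernel code: the function name `can_map_huge` and the idea of passing a single extent are assumptions made for the example, and the extent is assumed to already be physically contiguous.

```python
HUGE_2M = 2 * 1024 * 1024      # 2MB huge page
HUGE_1G = 1024 * 1024 * 1024   # 1GB huge page

def can_map_huge(phys_addr: int, length: int, huge_size: int = HUGE_2M) -> bool:
    """Check whether a physically contiguous extent can back one huge page.

    Hypothetical helper: the extent [phys_addr, phys_addr + length) is
    assumed to be physically contiguous, so the only remaining conditions
    are alignment of the start address and sufficient length.
    """
    aligned = phys_addr % huge_size == 0
    return aligned and length >= huge_size
```

An extent that starts even one 4KB page past a 2MB boundary fails the check, which is why a filesystem's allocator has to cooperate for huge pages to be usable.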

Xu pointed attendees at the 2016 NOVA paper [PDF] for more information, but gave a quick overview of some of NOVA's features. It is a log-structured filesystem that is designed for NVMM. Each inode has its own log, which contains only the metadata changes; the log entries point to the changed data itself. It uses a radix tree for block mappings and copy on write (CoW) for its file data.
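A toy model may help make that structure concrete. The sketch below is not NOVA's implementation; the names `ToyFS`, `Inode`, and `WriteEntry` are invented for illustration, a Python dict stands in for the radix tree, and a Python list stands in for NVMM data blocks.

```python
from dataclasses import dataclass, field

@dataclass
class WriteEntry:
    file_block: int   # logical block index within the file
    data_block: int   # index of the newly written data block

@dataclass
class Inode:
    log: list = field(default_factory=list)       # per-inode, metadata-only log
    mapping: dict = field(default_factory=dict)   # stand-in for the radix tree

class ToyFS:
    def __init__(self):
        self.blocks = []  # simulated NVMM data blocks

    def write(self, inode: Inode, file_block: int, data: bytes) -> None:
        # Copy on write: the old block is never overwritten in place.
        self.blocks.append(bytes(data))
        new_block = len(self.blocks) - 1
        # The log records only metadata and points at the new data block.
        inode.log.append(WriteEntry(file_block, new_block))
        # The block mapping is updated to find the latest data quickly.
        inode.mapping[file_block] = new_block

    def read(self, inode: Inode, file_block: int) -> bytes:
        return self.blocks[inode.mapping[file_block]]
```

Writing the same file block twice leaves both versions of the data in `blocks`, two entries in the inode's log, and a mapping that points at the newer one.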

NOVA uses a lightweight journaling scheme that simply records the head and tail pointers for a linked list of log entries in the journal. That leads to fast garbage collection as entries are dropped from the list when they are no longer valid. There is no copying unless invalid entries make up more than half of the log, in which case a new log is created to atomically replace the old one; the metadata log entries are only copied at that point.
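The two garbage-collection paths can be sketched roughly as follows. This is a simplified model under stated assumptions, not NOVA's code: grouping entries into fixed-size pages, the `PAGE_ENTRIES` value, and the `Log` class are all choices made for this example, and a Python list stands in for the linked list.

```python
PAGE_ENTRIES = 4  # entries per log page (hypothetical value for illustration)

class Log:
    def __init__(self):
        self.pages = [[]]  # each page is a list of [payload, valid] entries

    def append(self, payload):
        if len(self.pages[-1]) == PAGE_ENTRIES:
            self.pages.append([])
        entry = [payload, True]
        self.pages[-1].append(entry)
        return entry

    def invalidate(self, entry):
        entry[1] = False

    def live(self):
        return [e[0] for page in self.pages for e in page if e[1]]

    def collect(self):
        # Fast path: unlink pages with no valid entries; nothing is copied.
        # The tail page is kept because new entries are still appended to it.
        tail = self.pages[-1]
        kept = [p for p in self.pages[:-1] if any(e[1] for e in p)]
        self.pages = kept + [tail]
        total = sum(len(p) for p in self.pages)
        dead = sum(1 for p in self.pages for e in p if not e[1])
        if dead * 2 > total:
            # Slow path: more than half the log is invalid, so copy the
            # live entries into a fresh log and swap it in with a single
            # reference assignment (modeling the atomic replacement).
            new = Log()
            for payload in self.live():
                new.append(payload)
            self.pages = new.pages
            return "compacted"
        return "fast"
```

Entry handles returned by `append()` are not preserved across a compaction in this model; a real implementation would need to update whatever refers to the old entries.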

He showed some performance graphs comparing the DAX versions of ext4 and XFS with NOVA. Generally, NOVA performs better than either ext4 or XFS on most filebench workloads that he tested. The exception is the "web server" workload where the filesystems all performed roughly the same.

Xu said that a second RFC posting, based on 4.16-rc4, went out in March. That posting received some feedback, so he is working on those items and will be posting a v3 soon. The changes needed include 64-bit timestamps and better huge-page support.

Chinner asked about user-space tools and, in particular, whether there was an fsck for NOVA. That will be needed before the filesystem can be merged, as users will need to be able to repair their filesystems. Xu said there has been a focus on performance, so there is no fsck yet. Ted Ts'o noted that NOVA also needs a tool that can verify filesystem images, which will allow more tests in xfstests to be run on it.



The NOVA filesystem

Posted May 18, 2018 21:57 UTC (Fri) by jhoblitt (subscriber, #77733) [Link]

Wasn't btrfs merged without a fsck?

The NOVA filesystem

Posted May 18, 2018 23:17 UTC (Fri) by naptastic (guest, #60139) [Link]

I think the reason that a working fsck is required now is that btrfs was merged without a fsck.

The NOVA filesystem

Posted May 18, 2018 23:26 UTC (Fri) by willy (subscriber, #9762) [Link]

btrfsck was created in April 2007. Btrfs was merged in 2009.

The NOVA filesystem

Posted May 19, 2018 1:37 UTC (Sat) by jhoblitt (subscriber, #77733) [Link]

I can't find a related commit in btrfs-progs from before 2013. Could you provide a reference?

The NOVA filesystem

Posted May 19, 2018 2:25 UTC (Sat) by willy (subscriber, #9762) [Link]

The NOVA filesystem

Posted May 19, 2018 3:13 UTC (Sat) by sfeam (subscriber, #2841) [Link]

https://lwn.net/Articles/462543/

The NOVA filesystem

Posted May 19, 2018 14:02 UTC (Sat) by jhoblitt (subscriber, #77733) [Link]

I vaguely remember asking Chris Mason about said tool at a SCALE talk in the early 201Xs. There was no fsck at that time and it definitely was not "in the wild" when btrfs was merged.

The NOVA filesystem

Posted May 21, 2018 18:01 UTC (Mon) by kdave (subscriber, #44472) [Link]

That was probably late 2011/early 2012, when there was a lot of buzz about btrfsck. With the announced enterprise support by SUSE and Oracle, the question about fsck was frequent. There was fsck code in git; it was admittedly missing some features, but it was not really non-existent. In February 2012, some important repair features landed, and since then a lot of improvements have been added.

What is NVMM ?

Posted May 19, 2018 10:30 UTC (Sat) by ju3Ceemi (subscriber, #102464) [Link]

What is NVMM?
I cannot find anything about it.

What are the differences compared to NVMe?

What is NVMM ?

Posted May 19, 2018 13:11 UTC (Sat) by jake (editor, #205) [Link]

> What is NVMM ?

From the article:

> NOVA exclusively targets non-volatile main memory (NVMM) rather than traditional block devices (disks or SSDs).

jake

What is NVMM ?

Posted May 19, 2018 13:53 UTC (Sat) by ju3Ceemi (subscriber, #102464) [Link]

Sure, but that does not give any explanation of the technology itself.
I guess that Andiry Xu is working on this because there is specific upcoming hardware, yet I cannot find anything about it.

Besides, I do not understand the differences between this hardware and NVMe devices; or maybe NOVA is targeting NVMe devices as well?

What is NVMM ?

Posted May 20, 2018 14:41 UTC (Sun) by musicinmybrain (subscriber, #42780) [Link]

I think this work is primarily targeting NVDIMMs (non-volatile DIMMs). Most (all?) of the products on the market in this category currently use conventional DRAM that is hastily backed up to a Flash chip when the module loses power, much like the flash-backed write caches on certain RAID controllers. You can buy these NVDIMMs from major server vendors already. The obvious next step is to build them with an inherently nonvolatile technology like 3D XPoint.

What is NVMM ?

Posted May 19, 2018 14:43 UTC (Sat) by tux3 (subscriber, #101245) [Link]

Maybe it means something like ReRAM or 3D XPoint/Optane, as opposed to the good old NAND flash memory in NVMe SSDs?

What is NVMM ?

Posted May 19, 2018 19:24 UTC (Sat) by luzh (guest, #112103) [Link]

It means byte-addressable, cache-coherent, non-volatile memory operating on the main memory bus like DRAM. A more commonly used name for this memory now is "persistent memory" or "pmem". Relevant technologies are 3D XPoint, PCM, STT-RAM, etc.

What is NVMM ?

Posted May 19, 2018 19:40 UTC (Sat) by ju3Ceemi (subscriber, #102464) [Link]

Thank you sir for the kind details

The NOVA filesystem

Posted May 24, 2018 16:52 UTC (Thu) by lpremoli (guest, #94065) [Link]

Hi, does anybody know if the slides presented at LSFMM are available? Or where I can find the performance numbers?


Copyright © 2018, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds