<?xml version="1.0" encoding="UTF-8"?>

<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://purl.org/rss/1.0/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:syn="http://purl.org/rss/1.0/modules/syndication/"
>

  <channel rdf:about="http://lwn.net/headlines/226351/">
    <title>LWN: Comments on "The 2007 Linux Storage and File Systems Workshop"</title>
    <link>http://lwn.net/Articles/226351/</link>
    <description>
This is a special feed containing comments posted
to the individual LWN article titled &quot;The 2007 Linux Storage and File Systems Workshop&quot;.

    </description>

    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>2</syn:updateFrequency>
    <items>
      <rdf:Seq>
	<rdf:li resource="http://lwn.net/Articles/230791/rss" />
	<rdf:li resource="http://lwn.net/Articles/228545/rss" />
	<rdf:li resource="http://lwn.net/Articles/227676/rss" />
	<rdf:li resource="http://lwn.net/Articles/227671/rss" />
	<rdf:li resource="http://lwn.net/Articles/227588/rss" />
	<rdf:li resource="http://lwn.net/Articles/227587/rss" />
	<rdf:li resource="http://lwn.net/Articles/227585/rss" />
	<rdf:li resource="http://lwn.net/Articles/227582/rss" />
	<rdf:li resource="http://lwn.net/Articles/227581/rss" />
	<rdf:li resource="http://lwn.net/Articles/227574/rss" />
	<rdf:li resource="http://lwn.net/Articles/227572/rss" />
	<rdf:li resource="http://lwn.net/Articles/227537/rss" />
	<rdf:li resource="http://lwn.net/Articles/227520/rss" />
	<rdf:li resource="http://lwn.net/Articles/227483/rss" />
	<rdf:li resource="http://lwn.net/Articles/227469/rss" />
	<rdf:li resource="http://lwn.net/Articles/227464/rss" />
	<rdf:li resource="http://lwn.net/Articles/227408/rss" />
	<rdf:li resource="http://lwn.net/Articles/227402/rss" />
	<rdf:li resource="http://lwn.net/Articles/227351/rss" />
	<rdf:li resource="http://lwn.net/Articles/227299/rss" />
	<rdf:li resource="http://lwn.net/Articles/227293/rss" />
	<rdf:li resource="http://lwn.net/Articles/227259/rss" />
	<rdf:li resource="http://lwn.net/Articles/227255/rss" />
	<rdf:li resource="http://lwn.net/Articles/227249/rss" />
	<rdf:li resource="http://lwn.net/Articles/227221/rss" />
	<rdf:li resource="http://lwn.net/Articles/227197/rss" />
	<rdf:li resource="http://lwn.net/Articles/227175/rss" />
	<rdf:li resource="http://lwn.net/Articles/227088/rss" />
	<rdf:li resource="http://lwn.net/Articles/227082/rss" />
	<rdf:li resource="http://lwn.net/Articles/227053/rss" />
	<rdf:li resource="http://lwn.net/Articles/227050/rss" />
	<rdf:li resource="http://lwn.net/Articles/227029/rss" />
	<rdf:li resource="http://lwn.net/Articles/227031/rss" />
	<rdf:li resource="http://lwn.net/Articles/227026/rss" />
	<rdf:li resource="http://lwn.net/Articles/227002/rss" />
	<rdf:li resource="http://lwn.net/Articles/227000/rss" />
	<rdf:li resource="http://lwn.net/Articles/226999/rss" />
	<rdf:li resource="http://lwn.net/Articles/226996/rss" />
	<rdf:li resource="http://lwn.net/Articles/226991/rss" />
	<rdf:li resource="http://lwn.net/Articles/226974/rss" />
      
      </rdf:Seq>
    </items>

  </channel>
    <item rdf:about="http://lwn.net/Articles/230791/rss">
      <title>fsck / xfs - versus ZFS</title>
      <link>http://lwn.net/Articles/230791/rss</link>
      <dc:date>2007-04-17T14:06:11+00:00</dc:date>
      <dc:creator>qu1j0t3</dc:creator>
      <description>
      It would be wrong to assume ZFS has the same failure modes as XFS.

See, for instance: &lt;a href=&quot;http://blogs.sun.com/bill/entry/zfs_and_the_all_singing&quot;&gt;Bill Moore's blog&lt;/a&gt;.
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/228545/rss">
      <title>Gee that's tough.</title>
      <link>http://lwn.net/Articles/228545/rss</link>
      <dc:date>2007-03-31T00:57:57+00:00</dc:date>
      <dc:creator>wmf</dc:creator>
      <description>
      But ZFS doesn't use compare-by-hash; that's one of its advantages over previous work like Venti/Fossil.&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227676/rss">
      <title>NVCache for journal</title>
      <link>http://lwn.net/Articles/227676/rss</link>
      <dc:date>2007-03-25T02:20:38+00:00</dc:date>
      <dc:creator>sweikart</dc:creator>
      <description>
      &lt;font class=&quot;QuotedText&quot;&gt;&amp;gt; first, every write on the system would have to go through the&lt;/font&gt;&lt;br&gt;
&lt;font class=&quot;QuotedText&quot;&gt;&amp;gt; flash, so longest life access pattern or not, it may still have&lt;/font&gt;&lt;br&gt;
&lt;font class=&quot;QuotedText&quot;&gt;&amp;gt; problems.&lt;/font&gt;&lt;br&gt;
&lt;p&gt;
Actually, for most of us journal users, only the metadata writes&lt;br&gt;
go into the journal.&lt;br&gt;
&lt;p&gt;
&lt;font class=&quot;QuotedText&quot;&gt;&amp;gt; ... is that the most efficient method to record an atime update&lt;/font&gt;&lt;br&gt;
&lt;font class=&quot;QuotedText&quot;&gt;&amp;gt; in a journal?&lt;/font&gt;&lt;br&gt;
&lt;p&gt;
You raise a good point.  As a system administrator, I don't mount&lt;br&gt;
with noatime, because atime is too useful.  But, atime is not so&lt;br&gt;
critical that I need it to be journaled; so, I'd love to mount&lt;br&gt;
nojournaledatime.&lt;br&gt;
&lt;p&gt;
-scott&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227671/rss">
      <title>NVCache for journal</title>
      <link>http://lwn.net/Articles/227671/rss</link>
      <dc:date>2007-03-25T01:48:01+00:00</dc:date>
      <dc:creator>dlang</dc:creator>
      <description>
      using flash for a journal would have some headaches (potentially fixable)&lt;br&gt;
&lt;p&gt;
first, every write on the system would have to go through the flash, so longest life access pattern or not, it may still have problems.&lt;br&gt;
&lt;p&gt;
second, when the write is complete the system needs to go back to the journal (flash) and mark it as being completed&lt;br&gt;
&lt;p&gt;
third, the chunks of data going into the journal are of many different sizes (yes, in the end it all gets down to writing fixed size disk blocks, but is that the most efficient method to record an atime update in a journal? probably not)&lt;br&gt;
&lt;p&gt;
good drivers and layouts that are flash aware may be able to address these, if the API gives them sufficient control (for example, since flash bits can be changed from 1 to 0 without erasing, make sure the 'entry committed' flag is 0 when complete and you can just flip that bit, in theory)&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227588/rss">
      <title>DualFS</title>
      <link>http://lwn.net/Articles/227588/rss</link>
      <dc:date>2007-03-23T22:04:25+00:00</dc:date>
      <dc:creator>giraffedata</dc:creator>
      <description>
      &lt;blockquote&gt;
DualFS is a file system by Juan Piernas that separates data and meta data into separate file systems.
&lt;/blockquote&gt;
&lt;p&gt;
That's separate devices, not separate file systems.  A single DualFS file system has both metadata and file data, and the innovation is that they are stored in such a way that access to one doesn't interfere with access to the other.
&lt;p&gt;
And it's worth noting that while ideally these devices would not share a head, the experiments reported are done with the devices being partitions of a single physical device and &lt;em&gt;still&lt;/em&gt; show improvement.

      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227587/rss">
      <title>The storage systems contract</title>
      <link>http://lwn.net/Articles/227587/rss</link>
      <dc:date>2007-03-23T21:56:49+00:00</dc:date>
      <dc:creator>giraffedata</dc:creator>
      <description>
      &lt;blockquote&gt;
Storage systems have a simple and important contract to keep: given user data they must save that data to disk without loss or corruption even in the face of system crashes.
&lt;/blockquote&gt;
&lt;p&gt;
s/to disk//
&lt;p&gt;
The fact that there's a disk in there, if there is one, is none of the other party's business.  That's why clauses in that contract specifying what things such as &quot;fsync&quot; mean are so ambiguous.
&lt;p&gt;
The contract is actually quite complex in the area of what data is allowed to be lost in the face of a system crash.

      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227585/rss">
      <title>Vectorized Read?</title>
      <link>http://lwn.net/Articles/227585/rss</link>
      <dc:date>2007-03-23T21:36:45+00:00</dc:date>
      <dc:creator>vmole</dc:creator>
      <description>
      &lt;p&gt;Oh, duh, right. I knew that, at one time in the distant past. Sorry for the noise.&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227582/rss">
      <title>Vectorized Read?</title>
      <link>http://lwn.net/Articles/227582/rss</link>
      <dc:date>2007-03-23T21:34:26+00:00</dc:date>
      <dc:creator>giraffedata</dc:creator>
      <description>
      &lt;blockquote&gt;
... and you can read at specific offsets (which is what readv(2) gives you), isn't that being able to read specific blocks?
&lt;/blockquote&gt;
&lt;p&gt;
You need to be able to read not just specific offsets, but specific discontiguous offsets.  readv() does not give you that.  You have to read a contiguous area of the file (block device).  You cannot make it read from 4K to 8K and 12K to 16K.
&lt;p&gt;
What readv() adds to read() is that you can read that contiguous file region into discontiguous memory, whereas with read() it has to go into a single contiguous range of memory addresses.
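&lt;p&gt;
A minimal sketch of that distinction, in Python rather than C and not taken from any poster's code: `os.readv()` wraps readv(2) and `os.pread()` wraps pread(2). The helper names `scatter_read`, `read_ranges` and `demo`, and the 16-byte temporary file standing in for a block device, are assumptions made purely for illustration.

```python
import os
import tempfile


def scatter_read(fd, sizes):
    """One readv() call: a single contiguous file region lands in several buffers."""
    buffers = [bytearray(n) for n in sizes]
    # Fills the buffers in order, starting at the current file offset.
    # Short reads are ignored here for brevity.
    os.readv(fd, buffers)
    return [bytes(b) for b in buffers]


def read_ranges(fd, ranges):
    """Discontiguous file ranges take one pread() call per (offset, length) pair."""
    return [os.pread(fd, length, offset) for offset, length in ranges]


def demo():
    # A small temporary file stands in for the block device (illustrative only).
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"AAAABBBBCCCCDDDD")
        path = f.name
    fd = os.open(path, os.O_RDONLY)
    try:
        contiguous = scatter_read(fd, [4, 4])
        scattered = read_ranges(fd, [(4, 4), (12, 4)])
    finally:
        os.close(fd)
        os.unlink(path)
    return contiguous, scattered
```

With this sample data, `scatter_read` can only hand back bytes 0 to 7 split across two buffers, while fetching bytes 4 to 7 and 12 to 15 takes two separate pread() calls in `read_ranges`.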

      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227581/rss">
      <title>Vectorized Read?</title>
      <link>http://lwn.net/Articles/227581/rss</link>
      <dc:date>2007-03-23T21:28:50+00:00</dc:date>
      <dc:creator>zlynx</dc:creator>
      <description>
      I am going on the man page for readv.&lt;br&gt;
&lt;p&gt;
ssize_t readv(int fd, const struct iovec *vector, int count);&lt;br&gt;
struct iovec {&lt;br&gt;
    void *iov_base;   /* Starting address */&lt;br&gt;
    size_t iov_len;   /* Number of bytes */&lt;br&gt;
};&lt;br&gt;
&lt;p&gt;
iov_base and iov_len apply to RAM addresses, not disk block or character addresses.  The descriptor &quot;fd&quot; is read linearly for &quot;count&quot; iovec structures.&lt;br&gt;
&lt;p&gt;
So, readv does *not* give you the ability to read at specific offsets.  As I understand it from reading the documentation.&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227574/rss">
      <title>Vectorized Read?</title>
      <link>http://lwn.net/Articles/227574/rss</link>
      <dc:date>2007-03-23T20:22:08+00:00</dc:date>
      <dc:creator>vmole</dc:creator>
      <description>
      &lt;p&gt;Please explain further (I'm not doubting you, just trying to understand). If you open a block device (e.g. /dev/hda1), and you can read at specific offsets (which is what readv(2) gives you), isn't that being able to read specific blocks? I mean, I understand that it's not the underlying physical disk blocks, but isn't the mapping between physical disk blocks and the logical blocks that the filesystem sees all handled in hardware?
&lt;p&gt;A very quick glance at the e2fsprogs source indeed seems to use open(&quot;/dev/hda1&quot;) and read/write.
&lt;p&gt;Where's Mr. Tso when we need him?
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227572/rss">
      <title>Vectorized Read?</title>
      <link>http://lwn.net/Articles/227572/rss</link>
      <dc:date>2007-03-23T20:05:49+00:00</dc:date>
      <dc:creator>zlynx</dc:creator>
      <description>
      readv and writev both use a vector of memory buffers, but they are not for writing a vector of disk blocks.&lt;br&gt;
&lt;p&gt;
We probably need readiov/writeiov and readviov/writeviov or something like that.&lt;br&gt;
&lt;p&gt;
I also had a crazy idea just now.  What if they used device mapper to create a dm device with a linear view of every block fsck needed to read?  Let readahead run on that.&lt;br&gt;
&lt;p&gt;
How about readahead(2) or fadvise or posix_fadvise?  &lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227537/rss">
      <title>Vectorized Read?</title>
      <link>http://lwn.net/Articles/227537/rss</link>
      <dc:date>2007-03-23T16:55:23+00:00</dc:date>
      <dc:creator>vmole</dc:creator>
      <description>
      &lt;p&gt;Yes, that's what I meant by &quot;RAW_IO on the partition&quot; :-) But, again, if that would work, then readv(2) would seem to provide the required vectorized API, so why are the smart people saying they need a new API?

      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227520/rss">
      <title>The 2007 Linux Storage and File Systems Workshop</title>
      <link>http://lwn.net/Articles/227520/rss</link>
      <dc:date>2007-03-23T16:26:42+00:00</dc:date>
      <dc:creator>nix</dc:creator>
      <description>
      I can't see how you can eliminate opendir()/readdir(), but telldir() and &lt;br&gt;
the weirdness around it could be zapped quite easily.&lt;br&gt;
&lt;p&gt;
Outside of scripting language interfaces, the Linux kernel itself, and &lt;br&gt;
glibc, I see hardly any uses of telldir() on a modern Linux box (Ruby, &lt;br&gt;
Perl &amp;amp;c scripts aren't likely to depend on the detailed semantics of &lt;br&gt;
telldir() anyway because non-POSIX systems don't implement them). The &lt;br&gt;
Midnight Commander VFS implements a telldir operation but never appears to &lt;br&gt;
call it...&lt;br&gt;
&lt;p&gt;
... in fact, on my system here, I have no actual *uses* of telldir() at &lt;br&gt;
all. Even strfry() is called more often.&lt;br&gt;
&lt;p&gt;
I'd say telldir()'s semantics could be changed pretty easily. Something &lt;br&gt;
like it might remain potentially useful, but its (extremely annoying to &lt;br&gt;
implement) current semantics don't seem to matter to most real code.&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227483/rss">
      <title>NVCache for journal</title>
      <link>http://lwn.net/Articles/227483/rss</link>
      <dc:date>2007-03-23T07:41:14+00:00</dc:date>
      <dc:creator>xanni</dc:creator>
      <description>
      This looks like it could potentially be a really big win for DualFS in particular!&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227469/rss">
      <title>Vectorized Read?</title>
      <link>http://lwn.net/Articles/227469/rss</link>
      <dc:date>2007-03-23T04:14:09+00:00</dc:date>
      <dc:creator>ldo</dc:creator>
      <description>
      &lt;P&gt;&lt;BLOCKQUOTE&gt;&lt;FONT STYLE=&quot;color : #A04000&quot;&gt;Because fsck needs block addressed access, and aio_read() is based on (fd, offset, count).&lt;/FONT&gt;&lt;/BLOCKQUOTE&gt;

&lt;P&gt;Hint:
&lt;BLOCKQUOTE&gt;&lt;TT&gt;fd = open(&quot;/dev/sda&quot;, ...)&lt;/TT&gt;
&lt;/BLOCKQUOTE&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227464/rss">
      <title>NVCache for journal</title>
      <link>http://lwn.net/Articles/227464/rss</link>
      <dc:date>2007-03-23T03:56:49+00:00</dc:date>
      <dc:creator>sweikart</dc:creator>
      <description>
      &lt;font class=&quot;QuotedText&quot;&gt;&amp;gt; Also, a file system can use the NVCache to store its journal ...&lt;/font&gt;&lt;br&gt;
&lt;p&gt;
This seems like a great approach.  If the NVCache could appear as&lt;br&gt;
a block device that could be partitioned, filesystems that use jbd&lt;br&gt;
(like ext3) could use it right away (a smaller NVCache partition for&lt;br&gt;
read-mostly filesystems like /usr, a larger NVCache partition for&lt;br&gt;
write-mostly filesystems like /var).&lt;br&gt;
&lt;p&gt;
I assume journals are treated as ring buffers for writing, which&lt;br&gt;
is the right access pattern for prolonging the life of flash.&lt;br&gt;
&lt;p&gt;
-scott&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227408/rss">
      <title>The 2007 Linux Storage and File Systems Workshop</title>
      <link>http://lwn.net/Articles/227408/rss</link>
      <dc:date>2007-03-22T18:48:30+00:00</dc:date>
      <dc:creator>thedevil</dc:creator>
      <description>
      &lt;font class=&quot;QuotedText&quot;&gt;&amp;gt;&amp;gt;Ted Ts'o suggested that someone should try to go through committee to get telldir/seekdir/readdir fixed or eliminated&amp;lt;&amp;lt;&lt;/font&gt;&lt;br&gt;
&lt;p&gt;
So, if the committee (POSIX I suppose) goes along with that, how are we going to scan directories?&lt;br&gt;
&lt;p&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227402/rss">
      <title>Vectorized Read?</title>
      <link>http://lwn.net/Articles/227402/rss</link>
      <dc:date>2007-03-22T18:35:45+00:00</dc:date>
      <dc:creator>vmole</dc:creator>
      <description>
      &lt;p&gt;Because fsck needs block addressed access, and aio_read() is based on (fd, offset, count). Otherwise, you don't even need AIO; you'd just use readv(2). OTOH, it seems like readv() would be sufficient given the RAW_IO to the partition...so maybe someone else should answer this :-)
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227351/rss">
      <title>The 2007 Linux Storage and File Systems Workshop</title>
      <link>http://lwn.net/Articles/227351/rss</link>
      <dc:date>2007-03-22T17:08:02+00:00</dc:date>
      <dc:creator>jwb</dc:creator>
      <description>
      What do you mean by Ext4 integration?  Lustre already includes all Ext4 features and then some.  In fact you might say that Ext4 is just rolling features into the mainline kernel that have long been used in Lustre.  mballoc, delalloc, and extents have all been in Lustre for years.&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227299/rss">
      <title>What ext4 really needs is ...</title>
      <link>http://lwn.net/Articles/227299/rss</link>
      <dc:date>2007-03-22T15:16:46+00:00</dc:date>
      <dc:creator>jospoortvliet</dc:creator>
      <description>
      A /me too from here. I hate file corruption, esp if it goes unnoticed... Having some redundancy in the filesystem (reiser4 promised to bring that, zfs as well) would be great as well, to allow users to recover bad files.&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227293/rss">
      <title>Vectorized Read?</title>
      <link>http://lwn.net/Articles/227293/rss</link>
      <dc:date>2007-03-22T15:02:28+00:00</dc:date>
      <dc:creator>jospoortvliet</dc:creator>
      <description>
      Wouldn't this increase latency? And doesn't the kernel already do this, in a limited way?&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227259/rss">
      <title>The 2007 Linux Storage and File Systems Workshop</title>
      <link>http://lwn.net/Articles/227259/rss</link>
      <dc:date>2007-03-22T11:56:05+00:00</dc:date>
      <dc:creator>nix</dc:creator>
      <description>
      Of course Coda itself was an enhancement of AFS (losing most of its scalability in the process, though)...&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227255/rss">
      <title>The 2007 Linux Storage and File Systems Workshop</title>
      <link>http://lwn.net/Articles/227255/rss</link>
      <dc:date>2007-03-22T11:53:37+00:00</dc:date>
      <dc:creator>nix</dc:creator>
      <description>
      The page cache is aware of blocksizes differing from PAGE_SIZE, which provides a lot of what's needed, but that code is complex and delicate, and extending it to allow pieces of multiple files to co-exist in a single page-cache page is quite unlikely to be done (the memory savings are, after all, marginal, at half a page per file, and the complexity increase is significant).&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227249/rss">
      <title>fsck / xfs</title>
      <link>http://lwn.net/Articles/227249/rss</link>
      <dc:date>2007-03-22T11:47:46+00:00</dc:date>
      <dc:creator>wookey</dc:creator>
      <description>
      I too have found the hard way that yanking the power on XFS (or just hitting reset at a bad time) is a very bad idea. All the files that had pending writes just end up as the correct length of zeros. When this includes your package database, perl binaries and a load of other libs, this is quite bad.
&lt;p&gt;
 The xfs_repair tool did do a pretty-good repair job (once I fixed it so it ran! &lt;a href=&quot;http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=414079&quot;&gt;http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=414079&lt;/a&gt;) but it did take about 5 hours to do it on a pair of 200GB mirrored drives. Then I got to re-install everything to fix the damage.

&lt;p&gt; About 3 days faff in total. Fair dues though - there was no user-data loss and the system was recoverable, but I've never had this trouble with reiser3 on my laptop or ext3 on other boxes. So, yes, XFS is a really nice filesystem (live resizing, nice and fast) but I'd avoid it unless there is a UPS around. 
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227221/rss">
      <title>The 2007 Linux Storage and File Systems Workshop</title>
      <link>http://lwn.net/Articles/227221/rss</link>
      <dc:date>2007-03-22T10:20:23+00:00</dc:date>
      <dc:creator>wingo</dc:creator>
      <description>
      This would be a nice topic to revisit in an article -- networked filesystems. Samba for unix&amp;lt;-&amp;gt;unix is something I'd like to know more about.&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227197/rss">
      <title>Gee that's tough.</title>
      <link>http://lwn.net/Articles/227197/rss</link>
      <dc:date>2007-03-22T05:55:29+00:00</dc:date>
      <dc:creator>snitm</dc:creator>
      <description>
      LVM2 Snapshots are quite bad.  For starters they are done at the block-level whereas ZFS provides file-level snapshots (aka redirect on write).  LVM2 snapshots don't scale well either; seeing as each snapshot imposes a copy out penalty because there isn't a shared exception store (aka LVM snapshot LV) for all snapshots of an origin LV.&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227175/rss">
      <title>The 2007 Linux Storage and File Systems Workshop</title>
      <link>http://lwn.net/Articles/227175/rss</link>
      <dc:date>2007-03-22T02:01:26+00:00</dc:date>
      <dc:creator>drag</dc:creator>
      <description>
      I don't take them too seriously. :)&lt;br&gt;
&lt;p&gt;
But they aren't too far off. One thing worth noting is that Ext4 integration isn't mentioned anywhere on them, but it's obvious that Ext4 is going to play a large part in it.&lt;br&gt;
&lt;p&gt;
One thing about CFS, which I think is important to keep in mind, is that they  are descendants of the failed Coda and then the Intermezzo projects. I don't know the exact relationships, but I think that they were developers in those projects.&lt;br&gt;
&lt;p&gt;
The thing is that they learned the hard way that distributed network file system protocols aren't an easy thing to make, even if you are good at it. It takes a lot of time and effort to get anything going and a long time of development to get to the point where you can actually release anything.&lt;br&gt;
&lt;p&gt;
So it's not something that lends itself to the Linux-style development process of 'release early', 'release often'.&lt;br&gt;
&lt;p&gt;
So they formed CFS to pursue the money necessary to support themselves while they hacked on Lustre full time. The HPC market is the easiest and most profitable place to target for this sort of stuff, and they know from Beowulf stuff that open source and distributed computing can lead to dramatic results.&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227088/rss">
      <title>Persistent allocation without zeroing :== bug</title>
      <link>http://lwn.net/Articles/227088/rss</link>
      <dc:date>2007-03-21T17:40:25+00:00</dc:date>
      <dc:creator>davecb</dc:creator>
      <description>
        To be fair, I should comment that preallocation of disk blocks&lt;br&gt;
for write is a **good** idea, and one which the Samba team,&lt;br&gt;
among others, would welcome.  &lt;br&gt;
  The &quot;impedance mismatch&quot; between FAT and NTFS on one&lt;br&gt;
hand and Unix file systems on the other has caused lots of &lt;br&gt;
problems when Unix servers return an out of space indication&lt;br&gt;
in circumstances where Windows clients think an error can't&lt;br&gt;
happen (;-))&lt;br&gt;
  Of course, if the preallocated space isn't written to, I'd hope&lt;br&gt;
the blocks would be zeroed on close, and previous to the close,&lt;br&gt;
would not be available to be read. The latter introduces an impedance&lt;br&gt;
mismatch with Unix files opened for both write and read...&lt;br&gt;
&lt;p&gt;
--dave&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227082/rss">
      <title>Persistent allocation without zeroing :== bug</title>
      <link>http://lwn.net/Articles/227082/rss</link>
      <dc:date>2007-03-21T17:19:48+00:00</dc:date>
      <dc:creator>davecb</dc:creator>
      <description>
      Oh quite: that's what the GCOS program was &lt;br&gt;
intended to do.  It just didn't succeed.&lt;br&gt;
&lt;p&gt;
--dave&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227053/rss">
      <title>Gee that's tough.</title>
      <link>http://lwn.net/Articles/227053/rss</link>
      <dc:date>2007-03-21T16:06:04+00:00</dc:date>
      <dc:creator>bronson</dc:creator>
      <description>
      For snapshotting, just run whatever filesystem you want on top of LVM.  LVM is a little hard to get used to at first but it's definitely worth the effort.  Here are some notes I took when setting it up on my systems: &lt;a href=&quot;http://wiki.u32.net/LVM&quot;&gt;http://wiki.u32.net/LVM&lt;/a&gt;&lt;br&gt;
&lt;p&gt;
I now put all nontrivial partitions on LVM.  Works for me.  I'll let others argue whether LVM snapshots are worse than ZFS, or if ZFS is a layering violation.  :)&lt;br&gt;
&lt;p&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227050/rss">
      <title>The 2007 Linux Storage and File Systems Workshop</title>
      <link>http://lwn.net/Articles/227050/rss</link>
      <dc:date>2007-03-21T15:53:00+00:00</dc:date>
      <dc:creator>k8to</dc:creator>
      <description>
      You've mentioned this before in response to my discussion of issues encountered trying to use NFS for similar purposes.&lt;br&gt;
&lt;p&gt;
My response is still that sshfs is a great thing, and pretty useful for trivial tasks, or remote manipulation of low-bandwidth, high-latency small sets of files.  It's much easier to edit some remote configuration thing with a local tool via sshfs than most anything else. &lt;br&gt;
&lt;p&gt;
But sshfs still can't handle a variety of normal file sharing activities reasonably.  It fails entirely on mmap and large files make it choke because it hasn't got sufficient cache sophistication.  Over a LAN you'll never get 10% of your throughput while maxing your CPUs on the ciphers.  If the ssh link actually goes down (this happens), the whole thing gets very unhappy and it is impossible to recover.&lt;br&gt;
&lt;p&gt;
Basically the only thing that makes sshfs &quot;better&quot; than the traditional lousy network filesystems we love to hate is that it has a well defined focus.  It's a no-server-configuration filesystem for accessing small numbers of smallish files without high performance expectations.  It is a remarkably pleasant tool when used inside its scope, but one of the reasons it is pleasant is it has a much narrower scope.&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227029/rss">
      <title>The 2007 Linux Storage and File Systems Workshop</title>
      <link>http://lwn.net/Articles/227029/rss</link>
      <dc:date>2007-03-21T13:47:24+00:00</dc:date>
      <dc:creator>saffroy</dc:creator>
      <description>
      It's nice that you mention Lustre, actually I was kind of surprised that it would not be mentioned in this article. Lustre definitely has a great potential (great scalability, sequential I/O performance, client cache, excellent POSIX conformance), and I feel it could be a good general purpose global fs someday.&lt;br&gt;
&lt;p&gt;
That is, if its creators (CFS) let it grow out of its niche HPC market: at the moment, I feel they're more concerned about implementing the features asked for by their paying customers, which are big supercomputing centers. I'm certainly not blaming them for that, but for instance, they are more sensitive to large file throughput (tens of GB/s) than to file creation rates (Lustre is still damn slow here).&lt;br&gt;
&lt;p&gt;
If the community or the customers push in the right direction, Lustre can become an excellent distributed fs for nearly everyone, but I feel it has yet to happen -- and I hope it will.&lt;br&gt;
&lt;p&gt;
Oh, and don't take CFS roadmaps too seriously. ;-)&lt;br&gt;
&lt;p&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227031/rss">
      <title>Gee that's tough.</title>
      <link>http://lwn.net/Articles/227031/rss</link>
      <dc:date>2007-03-21T13:40:00+00:00</dc:date>
      <dc:creator>mennucc1</dc:creator>
      <description>
      &lt;font class=&quot;QuotedText&quot;&gt;&amp;gt; Most of the features of ZFS are available on Linux right now.&lt;/font&gt;&lt;br&gt;
&lt;p&gt;
but still I would like to have snapshots in EXT (maybe in 5 :-) ?)&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227026/rss">
      <title>The 2007 Linux Storage and File Systems Workshop</title>
      <link>http://lwn.net/Articles/227026/rss</link>
      <dc:date>2007-03-21T12:25:46+00:00</dc:date>
      <dc:creator>nix</dc:creator>
      <description>
      Of course, you can't usefully store named pipes or devices on NFS-shared filesystems, either...&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227002/rss">
      <title>The 2007 Linux Storage and File Systems Workshop</title>
      <link>http://lwn.net/Articles/227002/rss</link>
      <dc:date>2007-03-21T02:54:33+00:00</dc:date>
      <dc:creator>drag</dc:creator>
      <description>
      OpenAFS is, indeed, very nice.&lt;br&gt;
&lt;p&gt;
But it's Windows support is crap. Not because OpenAFS is not cool, but because Window uses SMB or Microsoft's DFS to do it's thing and nobody else's. OpenAFS has to use a sort of SMB emulation were it deals with AFS stuff then translates that to something that the system can use.&lt;br&gt;
&lt;p&gt;
But if your just dealing with Linux clients then that's not a problem.&lt;br&gt;
&lt;p&gt;
Also the file and directory permission model is bizzare and isn't realy compatable with just standard Unix-style ACL (user/group/world read/write/execute) model. So people used to Linux permissions have to relearn how to deal with AFS permissions.&lt;br&gt;
&lt;p&gt;
It's not posix, and it's not compatable with special file types like named pipes.&lt;br&gt;
&lt;p&gt;
Also there is no real way to access your data unless your AFS server stuff is actually running. OpenAFS tends to incure a higher amount of knowledge and administration stuff isn't very easy to deal with.&lt;br&gt;
&lt;p&gt;
Then its large file performance is really bad. It's just plain slow and the volumes are very limited in size.&lt;br&gt;
&lt;p&gt;
&lt;p&gt;
&lt;p&gt;
What it's VERY good for is if you have a large distributed network. &lt;br&gt;
&lt;p&gt;
Say you have a wireless network or a WAN-wide thing where you have an entire campus of computers to take care of. It handles disconnection very well, and its caching support is very good for semi-offline work (i.e. you can still edit a file even if you temporarily lose contact with the servers).&lt;br&gt;
&lt;p&gt;
Its security stuff is nice. The volume management is very nice: snapshotting and mirroring stuff. It's safe to use over the internet and unencrypted wireless networks.&lt;br&gt;
&lt;p&gt;
And as a special bonus its /afs/ directory tree is very handy. It allows people to move volumes around, change servers, set up mirrors, and all sorts of stuff without the clients having to know about any of these changes. Whereas with NFS or Samba, if you change out file servers or whatnot, the clients all have to be reconfigured to know the new locations and names of the servers and directories. With OpenAFS this is not necessary.&lt;br&gt;
&lt;p&gt;
&lt;p&gt;
&lt;p&gt;
But considering the lack of POSIX support and poor large file performance, as well as the permission issues, it's not really a replacement for NFS. It's an alternative that is useful in places where NFS is not.&lt;br&gt;
&lt;p&gt;
And its poor Windows support means that it's not useful as a replacement for Samba.&lt;br&gt;
&lt;p&gt;
&lt;p&gt;
But it's nice.&lt;br&gt;
&lt;p&gt;
&lt;p&gt;
OpenAFS points out a huge problem for Linux in general, though. AFS is ancient. It's old old old. It's like X Windows/Athena/Kerberos ancient. Still, even with its age, it's MUCH more sophisticated than the NFS or SMB network protocols. Nobody has really produced anything better.&lt;br&gt;
&lt;p&gt;
Lustre, maybe. It certainly has a lot of cool features and is fast. But I don't think that it has any security.&lt;br&gt;
&lt;p&gt;
It supports lots of stuff: TCP networking, InfiniBand, and all sorts of other bizarre interconnects.&lt;br&gt;
&lt;p&gt;
Supports ACLs, extended ACLs, extended attributes. Lots of high availability and high performance features. Failover, extra redundancy. You can use it as a root FS. It supports quotas.&lt;br&gt;
&lt;p&gt;
It doesn't require special patches and kernel recompiles for Linux client support.&lt;br&gt;
&lt;p&gt;
&lt;p&gt;
The only thing that it lacks is robust security. They plan on supporting GSSAPI and Kerberos with the 1.8.0 release. This is due out by the end of this year according to their roadmap...&lt;br&gt;
For Unix and Windows compatibility it supports SMB and NFS v2/v3/v4 export.&lt;br&gt;
&lt;p&gt;
&lt;p&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/227000/rss">
      <title>The 2007 Linux Storage and File Systems Workshop</title>
      <link>http://lwn.net/Articles/227000/rss</link>
      <dc:date>2007-03-21T00:02:55+00:00</dc:date>
      <dc:creator>saffroy</dc:creator>
      <description>
        &quot;There's no good solution today.&quot;&lt;br&gt;
&lt;p&gt;
Of course there is one, it's called OpenAFS. Except installing and managing an AFS cell is not quite easy (although the Debian packager provides useful scripts to quickly start a simple server)...&lt;br&gt;
&lt;p&gt;
But AFS has its share of good features:&lt;br&gt;
 - good Linux and Windows clients (and other *nixes as well)&lt;br&gt;
 - Kerberos authentication&lt;br&gt;
 - powerful ACLs&lt;br&gt;
 - excellent scalability: large sites (MIT, CMU...) run cells with thousands of clients (and several servers sharing the load of course)&lt;br&gt;
 - client-side caching using local storage&lt;br&gt;
 - the namespace is global, users can use the same paths on all clients&lt;br&gt;
 - data is organized in volumes (subtrees) that can be mounted anywhere in the global namespace, both by admins and users&lt;br&gt;
 - volumes can be snapshotted, and users can mount snapshots (no need for the admin to restore from tapes when a user accidentally deletes a file and notices immediately)&lt;br&gt;
&lt;p&gt;
...and more. :)&lt;br&gt;
&lt;p&gt;
I've used OpenAFS on Linux for years, and never complained (it helps that I have a competent and knowledgeable admin).&lt;br&gt;
&lt;p&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/226999/rss">
      <title>Persistent allocation without zeroing :== bug</title>
      <link>http://lwn.net/Articles/226999/rss</link>
      <dc:date>2007-03-20T23:46:59+00:00</dc:date>
      <dc:creator>saffroy</dc:creator>
      <description>
      The most common use I see for this feature is to have (more) predictable performance, and no allocation error (ENOSPC) when *writing* to a file (e.g. for realtime apps that capture data). If you plan to capture large amounts of data, zeroing blocks is not very convenient...&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/226996/rss">
      <title>The 2007 Linux Storage and File Systems Workshop</title>
      <link>http://lwn.net/Articles/226996/rss</link>
      <dc:date>2007-03-20T23:25:22+00:00</dc:date>
      <dc:creator>drag</dc:creator>
      <description>
      I doubt it's useful for large numbers of users, but for my personal stuff the FUSE-based sshfs is a superior replacement for NFS or Samba.&lt;br&gt;
&lt;p&gt;
Faster, strong encryption, strong authentication available, trivially easy to set up. Robust.&lt;br&gt;
&lt;p&gt;
The downside is that you can't use it for anything that requires special file types, like named pipes. So ~/ is out. No booting from it. But for serving up large media files or sharing arbitrary directories between computers it's great.&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/226991/rss">
      <title>Persistent allocation without zeroing :== bug</title>
      <link>http://lwn.net/Articles/226991/rss</link>
      <dc:date>2007-03-20T22:28:19+00:00</dc:date>
      <dc:creator>k8to</dc:creator>
      <description>
      Giving them the benefit of the doubt, perhaps they mean that the space is allocated, but the zeroing is done lazily. I.e. hand zeros to apps reading uninitialized space, or perhaps errors, or whatever. I.e. allocation doesn't necessarily mean you get to see what's there. Yes, allowing apps to read uninitialized reclaimed disk space leads to data leaks.&lt;br&gt;
      
      </description>
    </item>
    <item rdf:about="http://lwn.net/Articles/226974/rss">
      <title>The 2007 Linux Storage and File Systems Workshop</title>
      <link>http://lwn.net/Articles/226974/rss</link>
      <dc:date>2007-03-20T20:03:01+00:00</dc:date>
      <dc:creator>k8to</dc:creator>
      <description>
      The piece I haven't seen so far is &quot;here's how to configure Samba to remove all the crap you don't care about if you haven't got Windows&quot;. Granted, Samba 4 hasn't shipped, but a document or setting along those lines would be useful *now*. Does it exist?&lt;br&gt;
&lt;p&gt;
When I went looking I found a whole lot of complexity. I was really *not* interested in learning all the details of SMB and Windows networking; I just wanted my Linux machines to share files.&lt;br&gt;
      
      </description>
    </item>
</rdf:RDF>

