LinuxCon: Kernel roundtable covers more than just bloat
If you have already heard about the kernel roundtable at LinuxCon, it is likely due to Linus Torvalds's statement that the kernel is "huge and bloated". While much of the media focused on that soundbite, there was quite a bit more to the panel session. For one thing, Torvalds definitely validated the impression that the development process is working better than it ever has, which has made his job an "absolute pleasure" over the last few months. In addition, many other topics were discussed, from Torvalds's motivations to the lessons learned in the 2.6 development series—as well as a bit about bloat.
![[Roundtable]](https://static.lwn.net/images/lc-roundtable-sm.jpg)
The panel consisted of five kernel developers: Torvalds, Greg Kroah-Hartman of Novell, Chris Wright of Red Hat, Jonathan Corbet of LWN, and Ted Ts'o of IBM (and CTO of the Linux Foundation) sitting in for Arjan van de Ven, who got held up in the Netherlands due to visa problems. James Bottomley of Novell moderated the panel and set out to establish the ground rules by noting that he wanted to "do as little work as possible", so he wanted questions from the audience, in particular those that would require answers from Torvalds as "he is sitting up here hoping to answer as little as possible". Bottomley was reasonably successful in getting audience questions, but moderating the panel probably took a bit more effort than he claimed to be looking for.
Innovative features
Bottomley began with a question about the "most innovative feature" that went into the kernel in the last year. Wright noted that he had a "virtualization slant", so he pointed to the work done to improve "Linux as a hypervisor", including memory management improvements that will allow running more virtual machines more efficiently under Linux. Corbet and Ts'o both pointed to the ftrace and performance counters facilities that have been recently added. Tracing and performance monitoring have both been attacked in various ways over the years, without getting into the mainline, but it is interesting to see someone approach "the problem from a different direction, and then things take off", Corbet said.
Bottomley altered the question somewhat for Kroah-Hartman, asking about the best thing that had come out of the staging tree that Kroah-Hartman maintains. That seemed to stump him momentarily, so he mentioned the USB 3.0 drivers as an innovative feature added to the kernel recently, noting that Linux is the first OS to have a driver for that bus, even though hardware using it is not yet available to buy: "It's pretty impressive". After a moment's thought, though, Kroah-Hartman pointed out that he had gotten Torvalds's laptop to work by using a wireless driver from the staging tree, which completely justified that tree's existence.
Ts'o also noted the kernel mode setting (KMS) support for graphics devices as another innovative feature, pointing out that "it means that the X server no longer has to run as root—what a concept". He also suggested that it makes things easier for users, who could potentially get kernel error messages in the event of a system hang without having to hook up a serial console.
Making it easy for Linus
Torvalds took a "different tack" on the question, noting that he was quite pleased with "how much easier my job has been getting in the last few months". He said that it is a feature that is not visible to users but it is the feature that is most important to him, and that, in the end, "it improves, hopefully, the kernel in every area".
Because subsystem maintainers have focused on making it "easy for Linus" by keeping their trees in a more mergeable state, Torvalds has had more time to get involved in other areas. He can participate in more threads on linux-kernel and "sometimes fix bugs too". He clearly is enjoying that, especially because "I don't spend all my time just hating people that are sending merge requests that are hard to merge".
Over the last two merge windows (including the just-completed 2.6.32 window), things have been going much more smoothly. Smooth merges mean that Torvalds gets a "happy feeling inside that I know what I am merging — whether it works or not [is a] different issue". In order to know what he is merging, Torvalds depends on documentation and commit messages in the trees that outline what the feature is, as well as why people want it. To feel comfortable that the code will actually work, he relies on trusting the person whose tree he is merging to "fix up his problems afterwards".
Motivation
The first question from the audience was directed at Torvalds's motivation, both in the past and in the future. According to Torvalds, his motivation for working on the kernel has changed a lot over the years. It started with an interest in low-level programming that interacted directly with the hardware, but has slowly morphed into working with the community, though "I shouldn't say 'the community', because when anyone else says 'the community', my hackles rise [...] there's no one community". It is the social aspect of working with other people on the kernel project that is his main motivation today, part of which is that "I really enjoy arguing".
Torvalds's technical itch has already been scratched, so other things keep him going now: "All of my technical problems were solved so long ago that I don't even care [...] I do it because it's interesting and I feel like I am doing something worthwhile". He doesn't see that changing over the next 5-10 years, so, while he wouldn't predict the future, there is a clear sense that things will continue as they are—at least in that time frame.
Malicious code
Another question from the audience was about the increasing rate of kernel contributions and whether that made it harder to keep out malicious code from people with bad intentions. Kroah-Hartman said that it is hard to say what is malicious code versus just a bug, because "bugs are bugs". He said he doesn't remember any recent attempts to intentionally introduce malicious code.
Torvalds pointed out that the problem has never been people intentionally doing something bad, but, instead, trying to do something good and unintentionally ending up causing a security hole or other bug. He did note an attempt to introduce a back door into the kernel via the BitKeeper repository 7-8 years ago which "was caught by BitKeeper with checksums, because they [the attackers] weren't very good at it". While that is the only case he is aware of, "the really successful ones we wouldn't know about".
One of Git's design goals was to keep things completely decentralized and to cryptographically hash all of the objects, so that a compromise of a public git server would be immediately recognized because it would not match others' private trees, he said.
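The reason tampering is so visible is that a git object's name is just the SHA-1 hash of its type, size, and contents, so changing one byte of a blob changes its ID and breaks every tree and commit that refers to it. Here is a minimal sketch that reproduces a blob ID; its output should match `git hash-object <file>`, assuming OpenSSL is available and the file fits in memory:

```c
/*
 * Sketch: how git names a blob.  The object ID is the SHA-1 of the
 * header "blob <size>\0" followed by the file contents.
 * Build with: cc git_blob_id.c -lcrypto
 */
#include <openssl/sha.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
	unsigned char sha[SHA_DIGEST_LENGTH];
	unsigned char *buf;
	char header[64];
	long size;
	int hlen, i;
	FILE *f;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}
	f = fopen(argv[1], "rb");
	if (!f) {
		perror("fopen");
		return 1;
	}
	fseek(f, 0, SEEK_END);
	size = ftell(f);
	rewind(f);

	/* "blob <size>" plus the terminating NUL is part of what is hashed */
	hlen = snprintf(header, sizeof(header), "blob %ld", size) + 1;

	buf = malloc(hlen + size);
	if (!buf) {
		perror("malloc");
		return 1;
	}
	memcpy(buf, header, hlen);
	if (fread(buf + hlen, 1, size, f) != (size_t)size) {
		perror("fread");
		return 1;
	}
	fclose(f);

	SHA1(buf, hlen + size, sha);
	for (i = 0; i < SHA_DIGEST_LENGTH; i++)
		printf("%02x", sha[i]);		/* matches `git hash-object <file>` */
	printf("\n");
	free(buf);
	return 0;
}
```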
Performance regressions
Bottomley then turned to performance regressions, stating that Intel had been running a "database benchmark that we can't name" on every kernel release. They have found that performance drops a couple of percent with each release, with a cumulative effect over the last ten releases of about 12%. Torvalds responded that the kernel is "getting bloated and huge, yes, it's a problem".
"I'd love to say we have a plan
" for fixing that, Torvalds
said but it's not the case. Linux is "definitely not the
streamlined, small, hyper-efficient kernel that I envisioned 15 years
ago
"; the kernel has gotten large and "our icache
[instruction cache] footprint is scary
". The performance regression is
"unacceptable, but it's probably also unavoidable
" due to the
new features that get added with each release.
Audio and storage
In response to a question about professional audio, Torvalds said that the sound subsystem in the kernel is much better than it is given credit for, especially by "crazy" Slashdot commenters who pine for the days of the Open Sound System (OSS). Corbet also noted that audio issues have gotten a lot better, though, due to somewhat conflicting stories from the kernel developers over the years, audio developers "have had a bit of a rough ride".
A question about the need for handling memory failures, both in RAM and flash devices, led Ts'o to note that, based on his experience at a recent storage conference, there is "growing acceptance of the fact that hard disks aren't going away". Hard disks will always be cheaper, so flash will just be another element in the storage hierarchy. The flash hardware itself is better placed to know about and handle failures of its cells, so that is likely to be the place where it is done, he said.
Lessons learned
The lessons learned during the six years of the 2.6 development model were the subject of another question from Bottomley. Kroah-Hartman pointed to the linux-next tree as part of a better kernel development infrastructure that has led to more effective collaboration: "We know now how to work better together". Corbet noted that early 2.6 releases didn't have a merge window, which made the stability of those releases suffer. "What we've learned is some discipline", he said.
In comparing notes with the NTFS architect from Microsoft, Ts'o related that the core Windows OS team has a similar development model. "Redmond has independently come up with something almost identical to what we're doing", he said. They do quarterly releases, with a merge period followed by a stabilization period. Microsoft didn't copy the Linux development model, according to the NTFS architect, leading him and Ts'o to theorize that when doing development "on that scale, it's one of the few things that actually works well". That led Bottomley to jokingly suggest a headline: "Microsoft validates Linux development model".
Torvalds also noted that the development model is spreading: "The kernel way of doing things has clearly entered the 'hive mind' when it comes to open source". Other projects have adopted many of the processes and tools that the kernel developers use, including things like the sign-off process that was added in response to the SCO mess. Sign-offs provide a nice mechanism to see how a particular chunk of code reached the mainline, and other projects are finding value in that as well.
Overall, the roundtable gave an interesting view into the thinking of
the kernel developers. It was much more candid than a typical
marketing-centric view that comes from proprietary OS vendors. Of course,
that led to the "bloated" headlines that dominated the coverage of the
event, but it also gave the audience an unvarnished look at the kernel.
The Linux Foundation and Linux Pro magazine have made a video of the
roundtable available—unfortunately only in Flash format—which may be
of interest; it certainly was useful in augmenting the author's notes.
Index entries for this article
Conference: LinuxCon North America/2009
Posted Oct 1, 2009 4:31 UTC (Thu)
by dowdle (subscriber, #659)
[Link] (17 responses)
That isn't to say I want the kernel to keep getting slower.
If you want the video in a different format, I'd recommend visiting the page, pausing the video and waiting until it is completely buffered. Then you can copy /tmp/Flash{random-characters} to ~ and convert it to whatever format you want. Of course it would be nice to have a higher quality source to convert from but it isn't too bad.
Posted Oct 4, 2009 17:26 UTC (Sun)
by nevets (subscriber, #11875)
[Link] (1 responses)
One might argue that we've become 12% slower, but > 12% more secure.
Posted Oct 5, 2009 10:43 UTC (Mon)
by alex (subscriber, #1355)
[Link]
For example, once you have validated that a process can read a given file descriptor, do you need to re-run the whole capability checking logic for every sys_read()?
Of course, any such caching probably introduces another attack vector, so care would have to be taken with the implementation.
*ideal being a target even if you may never actually reach that goal.
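For what it's worth, this caching is broadly what the VFS already does: the expensive policy checks run at open time, the per-read check is reduced to a mode-flag test on the descriptor, and security modules can still hook each read, which is where the attack-vector caveat comes in. A hypothetical user-space sketch of the pattern (my_open(), my_read(), MODE_READ and policy_allows() are illustrative names, not kernel API):

```c
#include <stdbool.h>
#include <stdio.h>

#define MODE_READ	0x1
#define MODE_WRITE	0x2

struct fd_state {
	int granted;			/* access modes granted at open time */
};

/* stand-in for the full capability/security-policy check */
static bool policy_allows(int uid, const char *path, int mode)
{
	(void)path;
	return uid == 0 || mode == MODE_READ;	/* toy policy */
}

static int my_open(struct fd_state *f, int uid, const char *path, int mode)
{
	if (!policy_allows(uid, path, mode))	/* expensive check, done once */
		return -1;
	f->granted = mode;			/* cache the decision */
	return 0;
}

static int my_read(struct fd_state *f)
{
	if (!(f->granted & MODE_READ))		/* cheap per-read check */
		return -1;
	return 0;				/* ... do the actual read ... */
}

int main(void)
{
	struct fd_state f;

	if (my_open(&f, 1000, "/etc/motd", MODE_READ) == 0)
		printf("read %s\n", my_read(&f) == 0 ? "allowed" : "denied");
	return 0;
}
```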
Posted Oct 8, 2009 7:01 UTC (Thu)
by kragil (guest, #34373)
[Link]
In the interview he said that "at least Linux isn't this fat ugly pig that should have been shot 15 years ago"
I'd like to think that Linus is so bright that the bloat statement was intentional to get the kernel community working on a solution (don't tell me there isn't one that is way too easy), but he probably does not have these mad Sun Tzu communication skillz.
Maybe next time add the pig comment to put things into perspective for the media?
Posted Oct 1, 2009 10:26 UTC (Thu)
by job (guest, #670)
[Link]
Considering what 5% extra performance costs in hardware in the mainstream segment, running a year-old kernel would give you the same benefit (if you don't care about the newest features). That's not a good situation! Regarding the video, you can wget this link and feed it to mplayer/xine if you have a recent ffmpeg with VP6 installed.
Posted Oct 2, 2009 18:09 UTC (Fri)
by simonl (guest, #13603)
[Link] (8 responses)
When you argue that way, you take no pride in your code. And pride is what built this kernel.
Maybe the kernel devs have become too employed. Too focused on solving customers' immediate needs, and having too little time to go hunt down whatever catches their attention, big or small, with no prospects to please a project manager.
But look what Apple just did in their latest release: Nothing spectacular, except cleaning up. Someone has had the guts to nack new features and focus on removal.
Posted Oct 4, 2009 9:41 UTC (Sun)
by Los__D (guest, #15263)
[Link] (4 responses)
You could have a LIGHTNING FAST DOS box today. What good would it do you?
Posted Oct 5, 2009 10:34 UTC (Mon)
by eru (subscriber, #2753)
[Link] (3 responses)
But if you don't use the features, you should not need to pay the price! If I understand correctly, the slowdown has been seen in repeatable benchmarks that can be run on both old and new kernel versions. Therefore the benchmarked code certainly isn't using any new features, but it still gets slowed down. Not justifiable.
You could have a LIGHTNING FAST DOS box today. What good would it do you?
Bad comparison. MS-DOS always had severe problems that really did not have much to do with its small footprint. It was brain-damaged already on day one. An OS that does more or less what MS-DOS did, but in a sensible and stable way might still be useful.
Posted Oct 8, 2009 8:20 UTC (Thu)
by renox (guest, #23785)
[Link] (1 responses)
Linus referred to the icache footprint (size) of the kernel: if you add features, even when they are not used they increase the size of the generated code, so they reduce performance.
Without specific figures, it's difficult to know where the issue is; I wouldn't be surprised if SELinux or virtualisation were the culprit: these features seem quite invasive.
Posted Oct 8, 2009 9:03 UTC (Thu)
by dlang (guest, #313)
[Link]
so to not use the feature of SELinux you would compile a kernel without it.
the same thing goes for many features, turning them on at compile time increases the cache footprint and therefore slows the system, even if you don't use the feature. but you (usually) do have the option to not compile the code into the kernel to really avoid the runtime cost of them.
Posted Oct 8, 2009 9:10 UTC (Thu)
by bersl2 (guest, #34928)
[Link]
Then configure out what you don't want already. Really, you think going with your distro's generic kernel is efficient? It doesn't take very long to find /proc/config* and take out some of the above-mentioned features that can't be modular. That, or yell at your distro, for the little good that will do.
Posted Oct 4, 2009 12:04 UTC (Sun)
by fuhchee (guest, #40059)
[Link]
I suspect it's the other way around. Many customers care deeply about performance, and it is their vendors who must perform code-karate against this kind of "bloat" (slowdown). To justify each new thing, LKML rarely carries data beyond microbenchmarks.
Posted Oct 5, 2009 8:11 UTC (Mon)
by cmccabe (guest, #60281)
[Link]
Well, maybe, you can have clean code that runs 12% slower, or you can have code that's #ifdef'ed to hell that runs at the old speed. In that case, which would you rather have?
Keep in mind, if you choose route #2, people in 2015 might use your name as a curse...
Obviously this is an oversimplification. But still, the point remains: don't criticize the code until you've seen it and understand the tradeoffs.
Posted Oct 2, 2009 9:31 UTC (Fri)
by dwmw2 (subscriber, #2063)
[Link] (4 responses)
"The flash hardware itself is better placed to know about and handle failures of its cells, so that is likely to be the place where it is done, he said."
I was biting my tongue when he said that, so I didn't get up and heckle.
I think it's the wrong approach. It was all very well letting "intelligent" drives remap individual sectors underneath us so that we didn't have to worry about bad sectors or C-H-S and interleaving. But what the flash drives have to do to present a "disk" interface is much more than that; it's wrong to think that the same lessons apply here.
What the SSD does internally is a file system all of its own, commonly called a "translation layer". We then end up putting our own file system (ext4, btrfs, etc.) on top of that underlying file system.
Do you want to trust your data to a closed source file system implementation which you can't debug, can't improve and — most scarily — can't even fsck when it goes wrong, because you don't have direct access to the underlying medium?
I don't, certainly. The last two times I tried to install Linux to a SATA SSD, the disk was corrupted by the time I booted into the new system for the first time. The 'black box' model meant that there was no chance to recover — all I could do with the dead devices was throw them away, along with their entire contents.
File systems take a long time to get to maturity. And these translation layers aren't any different. We've been seeing for a long time that they are completely unreliable, although newer models are supposed to be somewhat better. But still, shipping them in a black box with no way for users to fix them or recover lost data is a bad idea.
That's just the reliability angle; there are also efficiency concerns with the filesystem-on-filesystem model. Flash is divided into "eraseblocks" of typically 128KiB or so. And getting larger as devices get larger. You can write in smaller chunks (typically 512 bytes or 2KiB, but also getting larger), but you can't just overwrite things as you desire. Each eraseblock is a bit like an Etch-A-Sketch. Once you've done your drawing, you can't just change bits of it; you have to wipe the whole block.
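The erase-before-rewrite behaviour is visible directly on a raw MTD device. A minimal sketch, assuming a /dev/mtd0 whose first eraseblock may safely be wiped (this destroys data), using the MEMGETINFO and MEMERASE ioctls:

```c
#include <mtd/mtd-user.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	struct mtd_info_user info;
	struct erase_info_user erase;
	int fd;

	fd = open("/dev/mtd0", O_RDWR);
	if (fd < 0) {
		perror("open /dev/mtd0");
		return 1;
	}
	if (ioctl(fd, MEMGETINFO, &info) < 0) {
		perror("MEMGETINFO");
		return 1;
	}
	printf("eraseblock: %u bytes, write unit: %u bytes\n",
	       info.erasesize, info.writesize);

	/* wipe the first eraseblock: every byte in it goes back to 0xff */
	erase.start = 0;
	erase.length = info.erasesize;
	if (ioctl(fd, MEMERASE, &erase) < 0)
		perror("MEMERASE");

	close(fd);
	return 0;
}
```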
Our flash will fill up as we use it, and some of the data on the flash will be still relevant. Other parts will have been rendered obsolete; replaced by other data or just deleted files that aren't relevant any more. Before our flash fills up completely, we need to recover some of the space taken by obsolete data. We pick an eraseblock, write out new copies of the data which are still valid, and then we can erase the selected block and re-use it. This process is called garbage collection.
One of the biggest disadvantages of the "pretend to be disk" approach is addressed by the recent TRIM work. The problem was that the disk didn't even know that certain data blocks were obsolete and could just be discarded. So it was faithfully copying those sectors around from eraseblock to eraseblock during its garbage collection, even though the contents of those sectors were not at all relevant — according to the file system, they were free space!
Once TRIM gets deployed for real, that'll help a lot. But there are other ways in which the model is suboptimal.
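The information TRIM carries is small: a starting offset and a length the file system no longer cares about. A user-space sketch of the same hint via the BLKDISCARD ioctl (available since 2.6.28); the device path is a placeholder and the call throws the range's contents away:

```c
#include <linux/fs.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	uint64_t range[2] = { 0, 1024 * 1024 };	/* first 1MiB of the device */
	int fd;

	fd = open("/dev/sdX", O_WRONLY);	/* placeholder device path */
	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (ioctl(fd, BLKDISCARD, &range) < 0)
		perror("BLKDISCARD");		/* device may not support discard */
	close(fd);
	return 0;
}
```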
The ideal case for garbage collection is that we'll find an eraseblock which contains only obsolete data, and in that case we can just erase it without having to copy anything at all. Rather than mixing volatile, short-term data in with the stable, long-term data, we actually want to keep them apart, in separate eraseblocks. But in the SSD model, the underlying "disk" can't easily tell which data is which — the real OS file system code can do a much better job.
And when we're doing this garbage collection, it's an ideal time for the OS file system to optimise its storage — to defragment or do whatever else it wants (combining data extents, recompressing, data de-duplication, etc.). It can even play tricks like writing new data out in a suboptimal but fast fashion, and then only optimising it later when it gets garbage collected. But when the "disk" is doing this for us behind our back in its own internal file system, we don't get the opportunity to do so.
I don't think Ted is right that the flash hardware is in the best place to handle "failures of its cells". In the SSD model, the flash hardware doesn't do that anyway — it's done by the file system on the embedded microcontroller sitting next to the flash.
I am certain that we can do better than that in our own file system code. All we need is a small amount of information from the flash. Telling us about ECC corrections is a first step, of course — when we had to correct a bunch of flipped bits using ECC, it's getting on for time to GC the eraseblock in question, writing out a clean copy of the data elsewhere. And there are technical reasons why we'll also want the flash to be able to say "please can you GC eraseblock #XX soon".
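Part of that information already exists for raw MTD devices, which keep per-device ECC statistics. A minimal sketch that reads them with the ECCGETSTATS ioctl; /dev/mtd0 is again a placeholder:

```c
#include <mtd/mtd-user.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	struct mtd_ecc_stats stats;
	int fd;

	fd = open("/dev/mtd0", O_RDONLY);
	if (fd < 0) {
		perror("open /dev/mtd0");
		return 1;
	}
	if (ioctl(fd, ECCGETSTATS, &stats) < 0) {
		perror("ECCGETSTATS");
		return 1;
	}
	printf("corrected bitflips:   %u\n", stats.corrected);
	printf("uncorrectable errors: %u\n", stats.failed);
	printf("bad blocks:           %u\n", stats.badblocks);
	close(fd);
	return 0;
}
```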
But I see absolutely no reason why we should put up with the "hardware" actually doing that kind of thing for us, behind our back. And badly.
Admittedly, the need to support legacy environments like DOS and to provide INT 13h "DISK BIOS" calls or at least a "block device" driver will never really go away. But that's not a problem. There are plenty of examples of translation layers done in software, where the OS really does have access to the real flash but still presents a block device driver to the OS. Linux has about 5 of them already. The corresponding "dumb" devices (like the M-Systems DiskOnChip which used to be extremely popular) are great for Linux, because we can use real file systems on them directly.
At the very least, we want the "intelligent" SSD devices to have a pass-through mode, so that we can talk directly to the underlying flash medium. That would also allow us to try to recover our data when the internal "file system" screws up, as well as allowing us to do things properly from our own OS file system code.
Posted Oct 5, 2009 13:09 UTC (Mon)
by i3839 (guest, #31386)
[Link] (1 responses)
> But I see absolutely no reason why we should put up with the "hardware"
> actually doing that kind of thing for us, behind our back. And badly.

Three reasons:

- Interoperability without losing flexibility.

For a "dumb" block driver to work the translation table must be fixed. And when running in legacy mode the drive will be very slow. There are a heap of problems waiting when it gives write access too.

It's very hard to let people switch filesystems. They mostly use the default one from their OS, or FAT. As long as disks are sold as units and not as part of a system this will be true.

- Performance.

Having dedicated hardware doing everything is faster and uses less power. Not talking about an ARM mc, but a custom asic. (Just compare Intel's SSD idle/active power usage to others.)

- Fast development.

Currently the flash chip interface is standardized with ONFI and the other end with SATA. All the interesting development happens in-between. Because it's wedged between stable interfaces it can change a lot without impacting anything else (except when it's done badly ;-).

So short term the situation is quite hopeless for direct hardware access. The best hope is probably SSDs with free specifications and open source firmware.

Long term I think we should get rid of the notion of a disk and go more to a model resembling RAM. You could buy arrays of flash and plug them in, instead of whole disks (they could look like "disks" to the user, but that's another matter). This decouples the flash from the controller and makes data recovery easier in case the controller dies.

What is needed is a flash specific interface which replaces SATA and implements the features that all good flash file systems need: things like multi-chip support, ECC handling, background erases and scatter gather DMA. Perhaps throw in enough support to implement RAID fast in software, if it makes enough sense. Basically a standardized small, simple and fast flash controller, preferably accessible via some kind of host memory interface (PCIe on x86). Make sure to make it flexible enough to handle things like MRAM/PRAM/FeRAM too.

Maybe AHCI is good enough after a few adaptations, but it can probably be a lot better. Or perhaps some other interface from the embedded world fits the description.

This will make embedding the controller on a SoC easier too, without the need for special support in the OS. SATA is really redundant in such cases.

I'm pretty sure most flash controllers already have such a chip, but don't expose it. Instead they hide it behind an embedded microcontroller which handles SATA and implements the FTL. They should standardize it and get rid of the power hungry, complexity adding bloat.

I also think it's crucial to get optimal performance; flash gets faster and faster. It seems pointless to get faster and faster SATA when PCIe is already fast enough. It's also silly to fake SATA by just implementing an AHCI controller with direct flash access. Keep SATA around for real external storage, not something that doesn't take much space anyway.

People that look at SSDs and see them just as disks and don't think about the future will think it's best if the hardware does as much as possible. But if you forget the classic disk model and look at what's really going on it seems obvious that the classic disk model isn't that simple anyway and doesn't fit flash or how the hardware looks like and could be used.
Posted Oct 6, 2009 6:23 UTC (Tue)
by dwmw2 (subscriber, #2063)
[Link]
We should probably take this discussion elsewhere. Your input would be welcome on the MTD list, where I've started a thread about what we want the hardware to look like, if we could have it our way.

But just a brief response...

"- Interoperability without losing flexibility."

This is still possible with a more flexible hardware design — you just implement the translation layer inside your driver, for legacy systems. M-Systems were doing this years ago with the DiskOnChip. More recently, take a look at the Moorestown NAND flash driver. You can happily use FAT on top of those. But of course you do have the opportunity to do a whole lot better, too. And also you have the opportunity to fix the translation layer if/when it goes wrong. And to recover your data.

"- Performance"

But this isn't being done in hardware. It's being done in software, on an extra microcontroller. Yes, we do need to look carefully at the interface we ask for, and make sure it can perform well. But there's no performance-based reason for the SSD model.

"- Fast development"

You jest, surely? We had TRIM support for FTL in Linux last year, developed in order to test the core TRIM code. When do we get it on "real hardware"? This year? Next?

Being "wedged between stable interfaces" isn't a boon, in this case. Because it's wedged under an inappropriate stable interface, we are severely hampered in what we can do with it.

"People that look at SSDs and see them just as disks and don't think about the future will think it's best if the hardware does as much as possible. But if you forget the classic disk model and look at what's really going on it seems obvious that the classic disk model isn't that simple anyway and doesn't fit flash or how the hardware looks like and could be used."

Agreed. I think it's OK for the hardware to do the same kind of thing that disk hardware does for us — ECC, and some block remapping to hide bad blocks. But that's all; we don't want it implementing a whole file system of its own just so it can pretend to be spinning rust. In particular, perpetuating the myth of 512-byte sectors is just silly.