
Wrong direction!

Posted Nov 5, 2008 13:07 UTC (Wed) by kev009 (subscriber, #43906)
Parent article: Linux and object storage devices

I think this is certainly the wrong direction. Manufacturers can't even get hard disk firmware, let alone RAID, right much of the time. Expecting them to do even more complex tasks is asking for trouble, not to mention obsolescence: they likely won't release updates after two or so years, if they are feeling generous.

The future of mainstream storage is squarely in solid-state technology such as flash memory, holographic storage, memristors, etc. It only makes sense to treat these at as low a level as possible: as arrays of raw memory addresses, just like RAM.

With many-core processors on the host, it seems foolish to even consider pushing this work into device firmware these days.



Wrong direction!

Posted Nov 5, 2008 13:12 UTC (Wed) by kev009 (subscriber, #43906) [Link]

Also, input from experts at EMC, NetApp, Seagate, IBM and Sun/StorageTek would be critical because they have been doing things like this for a long time.

In the end I still think a software layer is the only way to go because it can be developed and improved in the open as FOSS.

Wrong direction!

Posted Nov 5, 2008 15:22 UTC (Wed) by zlynx (subscriber, #2285) [Link]

Except that even RAM isn't just RAM anymore. No one who cares about performance treats memory like a big random access array. With multi-level caches and sequential prefetch, RAM is more like disk blocks. Flash memory is even more so, with its relatively slow read start and very slow write.
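The access-order point above can be sketched in Python. This is an illustration only: the function names are invented, and in C the cache and prefetch effects are far more dramatic, since Python's lists of boxed objects blur them. But the two traversal patterns are the ones being contrasted.

```python
# Two traversals of the same data. Row-major order touches memory
# sequentially, which caches and prefetchers reward; column-major order
# strides across rows on every step, which defeats them.
N = 1000
matrix = [[1] * N for _ in range(N)]

def sum_rows(m):
    # Sequential: the inner loop walks one contiguous row at a time.
    return sum(x for row in m for x in row)

def sum_cols(m):
    # Strided: the inner loop jumps from row to row for each column.
    n = len(m)
    return sum(m[i][j] for j in range(n) for i in range(n))

# Both compute the same result; only the memory access order differs.
assert sum_rows(matrix) == sum_cols(matrix) == 1_000_000
```

Timing the two versions on a large array in C (or even NumPy) makes the "RAM is more like disk blocks" observation concrete: the sequential version can be several times faster on the same data.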

When programmers don't understand and account for the underlying nature of the system, it results in awful code, like most Java software. So don't hide too much of it.

Wrong direction!

Posted Nov 6, 2008 3:32 UTC (Thu) by gdt (subscriber, #6284) [Link]

It only makes sense to treat these at as low a level as possible: as arrays of raw memory addresses, just like RAM.

That's hardly as low an abstraction as possible. For rotating storage it hides bad block remapping. For flash storage it hides wear levelling and delete-before-write.
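The wear-levelling point can be illustrated with a toy flash translation layer. Everything here (the `ToyFTL` name, the page counts) is invented for the sketch; real FTLs are vastly more involved, but the core behaviour is the same: because flash must be erased before it can be rewritten, a logical overwrite is silently redirected to a fresh physical page.

```python
# Toy flash translation layer of the kind hidden behind "raw" flash:
# logical writes are redirected to fresh physical pages, spreading wear
# across the device, because in-place update is impossible on flash.
class ToyFTL:
    def __init__(self, npages):
        self.mapping = {}                 # logical page -> physical page
        self.free = list(range(npages))   # erased pages ready for writing
        self.erase_counts = [0] * npages  # wear tracked per physical page

    def write(self, lpage, data_store, data):
        if lpage in self.mapping:
            # The old physical page must be erased before reuse.
            old = self.mapping[lpage]
            self.erase_counts[old] += 1
            self.free.append(old)
        phys = self.free.pop(0)
        self.mapping[lpage] = phys
        data_store[phys] = data

ftl = ToyFTL(npages=4)
store = {}
ftl.write(0, store, b"v1")
first_phys = ftl.mapping[0]
ftl.write(0, store, b"v2")
assert ftl.mapping[0] != first_phys  # rewrite landed on a fresh page
```

This is exactly the mapping that an "array of raw memory addresses" abstraction hides: the address the host writes to is not the place the bits land.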

The search isn't for the lowest possible abstraction to present to the computer, the search is for the abstraction which best mediates between the needs of the computer and the needs of the storage. Storage is increasingly remote and managed, and the current block-based abstraction and your offset-based proposal don't give enough information to the storage's management software.

It's the diversity of storage media that's currently driving object-based storage. It's a lot simpler to build complex storage (with features like migration between flash, disk and tape) if the storage is told what blocks are in a file rather than being left to guess.
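The contrast between the two interfaces can be sketched as follows. The class names are invented for illustration; real object storage devices speak the T10 OSD SCSI command set rather than a Python API. The point is what each side of the interface knows.

```python
# A block device sees a flat array of anonymous fixed-size blocks; an
# object store is told which bytes belong together, so it can migrate or
# replicate whole objects without guessing at file boundaries.

class BlockDevice:
    """Flat array of fixed-size blocks; knows nothing about files."""
    def __init__(self, nblocks, block_size=512):
        self.block_size = block_size
        self.blocks = [bytes(block_size)] * nblocks

    def write(self, lba, data):
        assert len(data) == self.block_size
        self.blocks[lba] = data

    def read(self, lba):
        return self.blocks[lba]

class ObjectStore:
    """Stores named, variable-length objects; migration between media
    can move an object as a unit because its extent is known."""
    def __init__(self):
        self.objects = {}

    def write(self, object_id, data):
        self.objects[object_id] = data

    def read(self, object_id):
        return self.objects[object_id]

bd = BlockDevice(nblocks=8)
bd.write(3, b"\x42" * 512)
assert bd.read(3) == b"\x42" * 512

obj = ObjectStore()
obj.write("inode:42", b"hello")
assert obj.read("inode:42") == b"hello"
```

With the block interface, a storage box that wants to move a file to tape must reverse-engineer the filesystem; with the object interface, the extent of each object is part of the contract.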

Someone asked, why not use a filesystem such as CIFS or NFS? The answer is that this leads to user-based authentication, which leads to a lot of unnecessary complexity for the storage. One of the aims of OBS is to allow storage to be leased out, and integrating with customers' authentication systems would have introduced a big hurdle.

Please note that I'm not an OBS defender; I'm only seeking to explain it. Conversely, I'm also not saying that OBS is such a poor idea that it shouldn't be in Linux. My own view is that the SCSI protocol itself is now inadequate for enterprise storage, as it is a poor fit for the link, network and transport protocols used in corporate networks. I don't see much sense in using a disk protocol to communicate between a computer and a storage manager (i.e., another computer). There's a lot of pretence happening there which could be stripped away for better performance and robustness. My view may be overly coloured by experience as a participant in the iSCSI working group.

It was wrong back then and it's wrong now

Posted Nov 6, 2008 16:20 UTC (Thu) by khim (subscriber, #9252) [Link]

For rotating storage it hides bad block remapping. For flash storage it hides wear levelling and delete-before-write.

Yes - and I've certainly had problems with the first, and I'm sure I'll have problems with the second. The only sane way to resolve this is to offer access at as low a level as possible - but not lower. I don't think checksum calculation for blocks on an HDD belongs in the OS kernel (it can be calculated in the HDD more or less for free, while a general-purpose CPU would spend significant power doing it), but bad block handling certainly should belong to the kernel - it has more resources to cope.
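The per-block checksum being argued over can be sketched like this. `zlib.crc32` stands in for whatever code a real drive computes in hardware; the function names and the dict-as-disk are invented for the illustration. What matters is the shape of the check: a code stored alongside each block lets silent corruption be detected on read.

```python
# Per-block checksumming: each block is stored with a CRC computed at
# write time and verified at read time.
import zlib

BLOCK_SIZE = 4096

def write_block(store, lba, data):
    assert len(data) == BLOCK_SIZE
    store[lba] = (data, zlib.crc32(data))

def read_block(store, lba):
    data, crc = store[lba]
    if zlib.crc32(data) != crc:
        raise IOError(f"checksum mismatch on block {lba}")
    return data

store = {}
write_block(store, 0, b"\x42" * BLOCK_SIZE)
assert read_block(store, 0) == b"\x42" * BLOCK_SIZE

# Simulate silent corruption: flip a byte without updating the CRC.
data, crc = store[0]
store[0] = (b"\x00" + data[1:], crc)
try:
    read_block(store, 0)
    assert False, "corruption should have been detected"
except IOError:
    pass
```

Whether this loop runs in the drive's ASIC or on the host CPU is exactly the dividing-line question the thread is debating.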

The search isn't for the lowest possible abstraction to present to the computer, the search is for the abstraction which best mediates between the needs of the computer and the needs of the storage.

Puhlease. What has this search offered us so far? Predictable and mostly unreliable HDDs and SSDs? I'd prefer raw flash, thank you.

It's a lot simpler to build complex storage (with features like migration between flash, disk and tape) if the storage is told what blocks are in a file rather than being left to guess.

What is the goal: a great storage subsystem or a great system? If the latter, then all these things must be done at the system level (and may be offered via NFS/CIFS).

Please note that I'm not an OBS defender; I'm only seeking to explain it. Conversely, I'm also not saying that OBS is such a poor idea that it shouldn't be in Linux.

OBS is quite a bad idea, but Linux will need some support for it anyway. And it's useful in some strange places (for example in KVM/VMware/Xen).

It was wrong back then and it's wrong now

Posted Nov 8, 2008 3:15 UTC (Sat) by Ze (guest, #54182) [Link]

Yes - and I've certainly had problems with the first, and I'm sure I'll have problems with the second. The only sane way to resolve this is to offer access at as low a level as possible - but not lower. I don't think checksum calculation for blocks on an HDD belongs in the OS kernel (it can be calculated in the HDD more or less for free, while a general-purpose CPU would spend significant power doing it), but bad block handling certainly should belong to the kernel - it has more resources to cope.

Yes, and what about the bandwidth requirements of sending the checksum over? Or if someone wishes to use that space for an error-correcting code or a more secure hash? Ultimately the time spent transferring files from the disk should be a small percentage compared to the time spent waiting for them. However, there are always going to be trade-offs between flexibility, speed and other things.

What is the goal: a great storage subsystem or a great system? If the latter, then all these things must be done at the system level (and may be offered via NFS/CIFS). OBS is quite a bad idea, but Linux will need some support for it anyway. And it's useful in some strange places (for example in KVM/VMware/Xen).

The whole point, though, is that NFS and CIFS are unsuitable for some uses. This is where object-based file systems come in; they are like a stripped-down form of NFS/CIFS. Ideally it'd be nice to have a layered setup that allows people to see the layers they want and not have to put up with the layers they don't need.

One downside I see for object-based file systems is with versioning file systems, which use the layout of the block layer to make having multiple versions cheap in space. That's a trade-off, though, that may be worth it to some and not to others.

It was wrong back then and it's wrong now

Posted Nov 9, 2008 0:37 UTC (Sun) by giraffedata (subscriber, #1954) [Link]

For rotating storage it hides bad block remapping.
Bad block remapping? It hides all block mapping. In a truly low level disk interface, Linux would address the disk by cylinder/head/sector. Indeed, there are things Linux could do more effectively if it controlled the storage at that level. And that's nowhere near the lowest conceivable abstraction, either.
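The cylinder/head/sector addressing mentioned here can be sketched with the classic geometry-based translation (sectors are 1-based by long-standing convention). The geometry values below are illustrative defaults; modern drives hide all of this behind LBA and report a fictional geometry when asked.

```python
# Classic LBA <-> CHS translation for a drive of known geometry.
def lba_to_chs(lba, heads=16, sectors_per_track=63):
    cylinder = lba // (heads * sectors_per_track)
    head = (lba // sectors_per_track) % heads
    sector = (lba % sectors_per_track) + 1  # sectors are 1-based
    return cylinder, head, sector

def chs_to_lba(c, h, s, heads=16, sectors_per_track=63):
    return (c * heads + h) * sectors_per_track + (s - 1)

assert lba_to_chs(0) == (0, 0, 1)
assert chs_to_lba(*lba_to_chs(123456)) == 123456
```

An OS that addressed the disk this way could, for example, place related data on the same cylinder to avoid seeks - which is precisely the kind of control the LBA abstraction took away.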

I don't think checksum calculation for blocks on an HDD belongs in the OS kernel (it can be calculated in the HDD more or less for free, while a general-purpose CPU would spend significant power doing it)

I see no reason for it to be a cheaper computation in the HDD than in the main computer. If there are special purpose processors in the HDD to do it cheaply, it's because that's where we've decided to do it; not vice versa.

the search is for the abstraction which best mediates between the needs of the computer and the needs of the storage.

I'd like to put it differently, because it's not what I can do for the computer and the storage, but what they can do for me. So: Which abstraction best leverages the abilities of the computer and those of the storage, to provide the most efficient storage service?

There was a time when the best dividing line was such that the main computer watched the bits stream off the head until it detected the start of a record, etc. That let us consolidate expensive CPUs. Today, we can deliver more storage service more cheaply by moving a great deal of that function to the other end of the cable. More recent technological changes might mean it's most efficient for file layout to move out there too.

I can think of a few reasons to stick function inside the storage product and the storage box instead of the main computer products and box:

  • The implementation must change more when the storage layer below it changes than when the application layer above it does.
  • Multiple main computers share the storage box.
  • It's expensive to move information over the cable.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds