Ext3 and RAID: silent data killers?

Posted Aug 31, 2009 21:49 UTC (Mon) by me@jasonclinton.com (✭ supporter ✭, #52701)
Parent article: Ext3 and RAID: silent data killers?

I'm a little surprised at the lack of any discussion of RAID battery backups.
All RAID enclosures and RAID host-adapters worth their salt have a BBU
(battery backup unit) option for exactly this purpose. Why the small block
write buffer on an SSD cannot be backed up by a suitably small zinc-air
battery is a mystery to me.



Ext3 and RAID: silent data killers?

Posted Aug 31, 2009 22:31 UTC (Mon) by proski (subscriber, #104) [Link]

I imagine that at least the more expensive SSDs have something like that, or maybe they can finish the write if the external power is disconnected. But when it comes to flash cards, like those used in digital cameras, the cost difference would be prohibitive.

Ext3 and RAID: silent data killers?

Posted Aug 31, 2009 22:34 UTC (Mon) by me@jasonclinton.com (✭ supporter ✭, #52701) [Link]

SD, MemoryStick, and CF cards do not have firmware.

Ext3 and RAID: silent data killers?

Posted Aug 31, 2009 22:45 UTC (Mon) by pizza (subscriber, #46) [Link]

Of course they have firmware; how else would (for example) a CF card translate the ATA commands into individual read/write ops on the appropriate flash chips and deal with wear levelling?

Granted, that "firmware" may be in the form of mask ROM, but I know of at least one case where a CF card had a firmware update released for it.

SD and MS are a lot simpler, but even they require something to translate the SD/MS wire protocols into flash read/write ops.

Ext3 and RAID: silent data killers?

Posted Aug 31, 2009 22:52 UTC (Mon) by me@jasonclinton.com (✭ supporter ✭, #52701) [Link]

Sorry, you're right about CF. I haven't seen one of those in ages.

Ext3 and RAID: silent data killers?

Posted Aug 31, 2009 23:27 UTC (Mon) by drag (subscriber, #31333) [Link]

AND SD and Memorystick and any other remotely consumer-related device.

They all are 'smart devices'.

If it were not for the firmware MTD-to-block translation, you could not use them in Windows and they could not be formatted FAT32.

When I have dealt with flash in the past, the raw flash type, the flash just appeared as a memory region. For example, I have an old i386 board I am dealing with whose flash starts at 0x80000 and goes on for about eight megs or so.

That's it. That's all the hardware does for you. You then have to know how to communicate with it, its underlying structure, the proper way to write to it, and everything else. All of that has to be done in software.

I suppose most of that is rather old fashioned; the flash was soldered directly onto the traces on the board.

I can imagine it would be quite difficult, and would require new hardware protocols, to allow an OS to properly manage flash directly over something like SATA or USB.

But fundamentally MTD devices are quite a bit different from block devices. It's a different class of I/O completely, just as a character device like a mouse or a keyboard can't be formatted with FAT32. You can fake MTD by running a block-to-MTD layer on SD flash or a file or anything else, and some people think that helps with wear leveling, but I think that is foolish and may actually end up being self-defeating, as you have no idea how the algorithms in the firmware work.

Ext3 and RAID: silent data killers?

Posted Aug 31, 2009 23:59 UTC (Mon) by BenHutchings (subscriber, #37955) [Link]

> AND SD and Memorystick and any other remotely consumer-related device. They all are 'smart devices'.

Not all. SmartMedia, xD and Memory Stick variants provide a raw flash interface - that's a major reason why they have had to be revised repeatedly to allow for higher-capacity chips. They rely on an external controller to do write-buffering, and do not support any wear-leveling layer.

> When I have dealt with Flash in the past, the raw flash type, the flash just appears as a memory region

It is possible for a flash controller to map NOR flash into memory since it is random-access for reading. However, large flash chips are all NAND flash which only supports block reads.

Ext3 and RAID: silent data killers?

Posted Sep 1, 2009 0:00 UTC (Tue) by me@jasonclinton.com (✭ supporter ✭, #52701) [Link]

Isn't the ATA/MMC<->MTD translation done in the consumer "reader" that you stick these devices in? CF is electrically compatible with ATA. That's not even remotely the case with the electrical interfaces on either SD or MS.

Ext3 and RAID: silent data killers?

Posted Sep 1, 2009 0:52 UTC (Tue) by drag (subscriber, #31333) [Link]

Maybe. I don't think so. At least not for SD.

Remember that SD stands for 'Secure Digital' and is DRM'd. So there has to be some smarts in it to do that.

Ext3 and RAID: silent data killers?

Posted Sep 1, 2009 6:22 UTC (Tue) by Los__D (guest, #15263) [Link]

Almost no SD cards support the DRM features, according to Wikipedia.

(Still doesn't change the point, though. SDs are probably designed with internal firmware)

Ext3 and RAID: silent data killers?

Posted Sep 1, 2009 9:36 UTC (Tue) by alonz (subscriber, #815) [Link]

That's not what Wikipedia says—they say few devices support CPRM. Which is more-or-less true—almost no devices in the western market use CPRM, while in Japan every single device does (it is required as part of i-Mode, which is mandated by DoCoMo).

As for firmware, the SD card interface (available for free at www.sdcard.org) defines accesses in terms of 512-byte “logical” sectors, practically requiring the card to implement a flash translation layer.
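To make the idea of a flash translation layer concrete, here is a minimal sketch in Python. Everything in it (the `ToyFTL` class, its methods, the page-per-sector layout) is hypothetical and invented for illustration; it is not how any real card's firmware works, and it ignores erase blocks, bad blocks, and real wear-leveling policy. It only shows why a host that addresses 512-byte logical sectors needs something on the card to remap them onto physical flash.

```python
SECTOR_SIZE = 512  # logical sector size exposed to the host, per the comment above


class ToyFTL:
    """Hypothetical, heavily simplified flash translation layer."""

    def __init__(self, physical_pages):
        self.flash = [None] * physical_pages    # None stands for an erased page
        self.mapping = {}                       # logical sector -> physical page
        self.free = list(range(physical_pages))  # erased pages, oldest first

    def write(self, sector, data):
        if len(data) != SECTOR_SIZE:
            raise ValueError("writes are whole 512-byte sectors")
        if not self.free:
            raise RuntimeError("no erased pages left")
        # Flash cannot be overwritten in place, so every write goes to a
        # fresh page and the logical-to-physical map is updated afterwards.
        page = self.free.pop(0)
        self.flash[page] = bytes(data)
        old = self.mapping.get(sector)
        self.mapping[sector] = page
        if old is not None:
            # Assumption: a page can be erased on its own. Real NAND erases
            # whole blocks of pages, which needs garbage collection.
            self.flash[old] = None
            self.free.append(old)

    def read(self, sector):
        page = self.mapping.get(sector)
        if page is None:
            return bytes(SECTOR_SIZE)           # never written: return zeros
        return self.flash[page]
```

Rewriting the same logical sector lands on a different physical page each time, which is the crude form of wear spreading that the host never sees.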

Ext3 and RAID: silent data killers?

Posted Sep 1, 2009 12:49 UTC (Tue) by Los__D (guest, #15263) [Link]

Doh, of course.

I read "devices" as the SD cards themselves.

Ext3 and RAID: silent data killers?

Posted Sep 1, 2009 16:57 UTC (Tue) by Baylink (guest, #755) [Link]

Generally, I think that's true, yes; the only small-flash technology that actually *looks like an ATA drive at the connector* is CF; the others require a smart reader to do the interfacing -- which may itself *not* look like ATA at the back; there are clearly other better ways to do this stuff.

Ext3 and RAID: silent data killers?

Posted Sep 1, 2009 17:37 UTC (Tue) by iabervon (subscriber, #722) [Link]

SD is MMC with a few extra features nobody uses. The readers do USB-storage<->SD, but SD is still 512-byte chunks (it's a card-reported value, and the host can actually try changing it, but 512 is the only value that is ever supported).

Ext3 and RAID: silent data killers?

Posted Sep 1, 2009 3:16 UTC (Tue) by zlynx (subscriber, #2285) [Link]

All the RAID discussion on the list was about the Linux MD/DM software RAID. It isn't as reliable as other options.

From what I gather, MD does not use write-intent logging by default, and when it is enabled it is very inefficient, probably because it doesn't spread the write-intent logs around the disks. Also, MD does not detect an unclean shutdown, so it does not start a RAID scrub and go into read+verify mode. And all that is a problem even when the array isn't degraded.

And of course it doesn't have a battery backup. :)

All that said, Linux MD hasn't given me any problems, and I prefer it over most cheap hardware RAID.

Ext3 and RAID: silent data killers?

Posted Sep 1, 2009 5:14 UTC (Tue) by neilbrown (subscriber, #359) [Link]

I don't think "is very inefficient" is correct. There is a real performance impact, but the size of that impact is very dependent on workload and configuration. It is easy to add or remove the write-intent logging while the array is active, so there is minimal cost in experimenting to see what impact it really has for a given usage.

And MD most certainly does detect an unclean shutdown and will validate all parity blocks on restart.

But you are right that it doesn't have battery backup. If fast NVRAM were available on commodity server hardware, I suspect we would get support for it in md/raid5 in fairly short order.
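
As a rough illustration of what validating parity after an unclean shutdown involves, the following Python sketch checks RAID5-style XOR parity stripe by stripe. The function names and the in-memory stripe layout are assumptions made for this example, not md's actual code or data structures. The optional `dirty` set stands in for a write-intent log: with it, only stripes that were being written need checking; without it, every stripe does.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def find_inconsistent_stripes(stripes, dirty=None):
    """Return indices of stripes whose stored parity does not match their data.

    stripes: list of (data_blocks, parity_block) tuples (hypothetical layout).
    dirty:   optional set of stripe indices to check, standing in for a
             write-intent log; None means check every stripe.
    """
    to_check = range(len(stripes)) if dirty is None else sorted(dirty)
    return [i for i in to_check if xor_blocks(stripes[i][0]) != stripes[i][1]]
```

A stripe whose data block was updated before the power failed, but whose parity was not, shows up in the returned list. Note that a mismatch alone does not say which block is the stale one.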

Ext3 and RAID: silent data killers?

Posted Sep 1, 2009 15:30 UTC (Tue) by zlynx (subscriber, #2285) [Link]

As far as I could tell, MD will not verify parity during regular reads while the array is unclean.

It may start a background verify, although it seemed to me that was dependent on what the distro's startup scripts did...

Ext3 and RAID: silent data killers?

Posted Sep 2, 2009 0:09 UTC (Wed) by Richard_J_Neill (subscriber, #23093) [Link]

> I'm a little surprised at the lack of any discussion of RAID battery backups.
> All RAID enclosures and RAID host-adapters worth their salt have a BBU
> (battery backup unit) option for exactly this purpose.

Yes...but it's only good for a few hours. So if your power outage lasts more than that, then the BBWC (battery backed write cache) is still toast.

On a related note, I've just bought a pair of HP servers (DL380s) and an IBM X3550. It's very annoying that there is no way to buy either of these without hardware raid, nor can the raid card be turned off in the BIOS. For proper reliability, I only really trust software (md) raid in Raid 1 mode (with write caching off). [Aside: this kills the performance for database workloads (fdatasync), though the Intel X25-E SSDs outperform 10k SAS drives by a factor of about 12.]
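
For readers unfamiliar with the fdatasync pattern mentioned in the aside, this is roughly what a database-style durable append looks like, sketched in Python. The `durable_append` helper is a made-up name for illustration. The call asks the kernel to flush the file's data; whether the bytes have reached stable media still depends on the drive or controller honouring the flush, which is exactly why write caching matters here.

```python
import os


def durable_append(path, record):
    """Append record to path and wait until the kernel reports it flushed."""
    # os.fdatasync is not available on every platform; fall back to os.fsync.
    sync = getattr(os, "fdatasync", os.fsync)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview(record)
        while view:                      # os.write may write fewer bytes than asked
            written = os.write(fd, view)
            view = view[written:]
        sync(fd)                         # blocks until the data is flushed
    finally:
        os.close(fd)
```

Each call pays the full latency of a flush, so workloads that commit many small records are dominated by how fast the storage can acknowledge one.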

Ext3 and RAID: silent data killers?

Posted Sep 2, 2009 0:13 UTC (Wed) by dlang (subscriber, #313) [Link]

actually, in many cases the batteries for the raid cards can last for several weeks.

Ext3 and RAID: silent data killers?

Posted Sep 4, 2009 10:32 UTC (Fri) by nix (subscriber, #2304) [Link]

Yes indeed. My Areca RAID card claims a month's battery life. Thankfully I've never had cause to check this, but I guess residents of Auckland in the 1998 power-generation fiasco would have liked it. :)

Ext3 and RAID: silent data killers?

Posted Sep 5, 2009 0:01 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

The battery in a RAID adapter card only barely addresses the issue; I don't care how long it lives.

But the comment also addressed "RAID enclosures," which I take to mean storage servers that use RAID technology. Those, if they are at all serious, have batteries that power the disk drives as well, and only for a few seconds -- long enough to finish the write. It's not about backup power, it's about a system in which data is always consistent and persistent, even if someone pulled a power cord at some point.

Ext3 and RAID: silent data killers?

Posted Sep 5, 2009 0:31 UTC (Sat) by dlang (subscriber, #313) [Link]

actually, there are a LOT of enclosures that don't provide battery backup for the drives at all, just for the cache.

it's possible that they have heavy duty power supplies that keep power up for a fraction of a second after the power fail signal goes out to the drives, but they definitely do not keep the drives spinning long enough to flush their caches.

Ext3 and RAID: silent data killers?

Posted Sep 7, 2009 22:47 UTC (Mon) by nix (subscriber, #2304) [Link]

Ah, I see, the point is that even if you turn off the power *and pull the
disk* halfway through a write, the disk state is still consistent? Yeah,
battery-backed cache alone obviously can't ensure that.

Ext3 and RAID: silent data killers?

Posted Sep 7, 2009 23:18 UTC (Mon) by giraffedata (subscriber, #1954) [Link]

> Ah, I see, the point is that even if you turn off the power *and pull the disk* halfway through a write, the disk state is still consistent? Yeah, battery-backed cache alone obviously can't ensure that.

No one said anything about pulling a disk. I did mention pulling a power cord. I meant the power cord that supplies the RAID enclosure (storage server).

A RAID enclosure with a battery inside that powers only the memory can keep the data consistent in the face of a power cord pull, but fails the persistence test, because the battery eventually dies. I think when people think persistent, they think indefinite. High end storage servers do in fact let you pull the power cord and not plug it in again for years and still be able to read back all the data that was completely written to the server before the pull. Some do it by powering disk drives (not necessarily the ones that normally hold the data) for a few seconds on battery.

Also, I think some people expect of persistence that you can take the machine, once powered down, apart and put it back together and the data will still be there. Battery backed memory probably fails that test.

Ext3 and RAID: silent data killers?

Posted Sep 8, 2009 4:56 UTC (Tue) by dlang (subscriber, #313) [Link]

I don't know what 'high end storage servers' you are talking about; even the multi-million dollar arrays from EMC and IBM do not have the characteristics that you are claiming.

Ext3 and RAID: silent data killers?

Posted Sep 8, 2009 6:25 UTC (Tue) by giraffedata (subscriber, #1954) [Link]

> I don't know what 'high end storage servers' you are talking about; even the multi-million dollar arrays from EMC and IBM do not have the characteristics that you are claiming.

Now that you mention it, I do remember that earlier IBM Sharks had nonvolatile storage based on a battery. Current ones don't, though. The battery's only job is to allow the machine to dump critical memory contents to disk drives after a loss of external power. I think that's the trend, but I haven't kept up on what EMC, Hitachi, etc. are doing. IBM's other high end storage server, the former XIV Nextra, is the same.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds