McKenney: Stupid RCU Tricks: rcutorture Catches an RCU Bug
Posted Nov 22, 2014 2:15 UTC (Sat) by ncm (guest, #165)
In reply to: McKenney: Stupid RCU Tricks: rcutorture Catches an RCU Bug by reedstrm
Parent article: McKenney: Stupid RCU Tricks: rcutorture Catches an RCU Bug
But when it matters, there's no substitute for battery backup and a powered-up shutdown. Five minutes of backup is plenty if after four minutes you start a shutdown. Thirty seconds is plenty if you panic the kernel at the first hiccup and let the drive drain its buffer in peace.
Posted Nov 22, 2014 5:00 UTC (Sat)
by dlang (guest, #313)
[Link] (7 responses)
As long as the disk only corrupts the sector it's in the middle of writing to, Postgres will not lose any data that it has reported as safe.
Now, if the drive goes off and scribbles on other parts of the drive as it loses power, all bets are off. But such drives really do not exist: they detect the power failure fast enough to stop writing before any mechanical movement is affected.
Posted Nov 25, 2014 14:15 UTC (Tue)
by ncm (guest, #165)
[Link] (6 responses)
If PG are doing power-off tests, you can bet they are stacking the deck with chosen hardware carefully configured to minimize problems, because if they did find corruption, there is nothing they could do to prevent it next time. The odds are that any given power drop won't corrupt much. Testing fewer than 10,000 power drops is just theater.
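To put a rough number on the sample-size point: assume, purely for illustration, that each power drop corrupts data independently with some fixed probability p. Nothing in this thread measures p; the function names below are hypothetical. Under that assumption, the chance that n drops all come back clean is (1 - p)^n, and a clean run of n drops only rules out (at 95% confidence) values of p above roughly 3/n.

```python
def prob_all_clean(p: float, n: int) -> float:
    """Chance that n independent power drops all pass, given per-drop corruption probability p."""
    return (1.0 - p) ** n


def upper_bound_95(n: int) -> float:
    """Largest p still consistent with n clean drops at 95% confidence.

    Solves (1 - p)^n = 0.05 for p; for large n this is close to 3 / n.
    """
    return 1.0 - 0.05 ** (1.0 / n)


for n in (100, 1_000, 10_000):
    # A drive that corrupts data once per thousand drops (assumed rate)
    # still passes a short test run most of the time.
    print(n, prob_all_clean(1e-3, n), upper_bound_95(n))
```

Under these assumptions, 100 clean drops only exclude corruption rates above about 3%, while 10,000 clean drops bring that bound down to about 0.03%.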
Posted Nov 25, 2014 15:38 UTC (Tue)
by magila (guest, #49627)
[Link] (2 responses)
Posted Dec 5, 2014 1:22 UTC (Fri)
by giraffedata (guest, #1954)
[Link] (1 responses)
How could it be more than one sector?
I haven't heard of this technology, but I'm very familiar with quite old technology in which the drive detects supply voltage about to sink below a safe level and cuts the write current to the head immediately, so the damage is limited to one sector. This technology doesn't even involve instructions executing, so there is no reserve energy to speak of required.
Posted Dec 11, 2014 20:57 UTC (Thu)
by jimparis (guest, #38647)
[Link]
Shingled Magnetic Recording (SMR) overlaps tracks, so you're always corrupting the next track whenever you write one. The next track needs to get rewritten too (until you hit a gap between tracks).
Posted Nov 25, 2014 17:11 UTC (Tue)
by Wol (subscriber, #4433)
[Link]
The whole point of which is to save the same data twice, and thus enable the system to detect that something bad has happened. Of course, what it does after detecting trouble depends on how bad the trouble is, but it is mandatory that it either replays the log to properly update the data, or reverts the log to properly undo the transaction.
It doesn't matter WHAT happens to the disk: so long as there is no disk failure that randomly scribbles over the disk, Postgres (and pretty much any other database) will provide guarantees that enable you to get back to a clean state, either pre- or post- attempted write.
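A minimal sketch of the detect-then-replay idea, assuming a made-up record format (4-byte length, 4-byte CRC32, payload). This is not Postgres's actual on-disk log format, and `encode_record` and `recover` are hypothetical names. It only shows how a checksummed log lets recovery tell an intact record from one torn by a power loss.

```python
import struct
import zlib


def encode_record(payload: bytes) -> bytes:
    """Frame a log record as: 4-byte length, 4-byte CRC32, payload."""
    return struct.pack("<II", len(payload), zlib.crc32(payload)) + payload


def recover(log: bytes) -> list[bytes]:
    """Return every intact record, stopping at the first torn or corrupt one."""
    records = []
    pos = 0
    while pos + 8 <= len(log):
        length, crc = struct.unpack_from("<II", log, pos)
        payload = log[pos + 8 : pos + 8 + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # torn write: drop this record and anything after it
        records.append(payload)
        pos += 8 + length
    return records


log = encode_record(b"set a=1") + encode_record(b"set b=2")
torn = log[:-3]  # simulate power loss partway through the last record
print(recover(log))   # both records survive
print(recover(torn))  # only the first record survives
```

The records that survive are replayed against the data files; the torn tail corresponds to a transaction that was never reported as committed, so discarding it is safe.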
For example, in the database I want to write I'm planning to use COW. Provided (big if :-( the OS doesn't muck up my user-space write order, that's guaranteed not to corrupt data.
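One simple form of copy-on-write, sketched at whole-file granularity: never overwrite live data in place, write a new copy, force it to disk, then swap it in. This is only an illustration of the ordering requirement, not the design described above; `atomic_replace` is a hypothetical helper and the sketch assumes a POSIX filesystem where rename is atomic.

```python
import os
import tempfile


def atomic_replace(path: str, data: bytes) -> None:
    """Write data to a new file, flush it to disk, then swap it in for path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # new copy must be durable before the swap
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # clean up the unfinished copy
        raise
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)  # persist the rename itself
    finally:
        os.close(dir_fd)
```

The two `fsync` calls are where the write order is pinned down: if the rename could reach the disk before the new file's contents, a power loss would leave the name pointing at garbage.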
Cheers,
Posted Dec 4, 2014 2:50 UTC (Thu)
by dw (guest, #12017)
[Link] (1 responses)
Posted Dec 8, 2014 9:57 UTC (Mon)
by arnd (subscriber, #8866)
[Link]
I wouldn't be surprised if consumer-grade SSDs have similar problems without such workarounds.
pulling the plug on a disk drive
Modern hard drives have hardware to detect a power failure and stop the write head at the next sector boundary... So in practice you will not see more than two corrupted sectors after a power failure.
Wol
Manufacturers have been promising such things for a very long time. Here is the manual for a 26-year-old drive, where section 2.1.4 clearly indicates “No damage or loss of data will occur if power is applied or removed during drive operation, except that data may be lost in the sector being written at the time of power loss”. (With thanks to Howard Chu for excavating that little nugget.)
