MVCC implementation
MVCC implementation
Posted Jan 3, 2010 20:21 UTC (Sun) by butlerm (subscriber, #13312)In reply to: So how does PostGreSQL compare to MySQL? by ringerc
Parent article: The ongoing MySQL campaign
design</em>
Not quite. Oracle's MVCC, for example, works something like the following
(I believe InnoDB uses much the same scheme):
1. The latest version of any row is stored in the primary data block, along
with some transaction related meta data.
2. The information necessary to recover prior versions of any row is stored
in separate rollback segments
3. Changes to both are written to the redo log
4. When a query is processed, if the version of the row currently stored in
a data block isn't the one needed, a prior version is recovered by
referring to the appropriate rollback segment.
5. When a transaction commits, or soon thereafter, the rollback data for
that transaction is discarded. No disk I/O required.
6. Alternatively, if a transaction is rolled back, prior versions of
modified rows are recovered from the rollback segment and restored to their
original positions in the primary data blocks.
The main advantage of this scheme is that data blocks generally only get
modified once per transaction, rather than once initially and once some
time later, during the vacuum process. However, redo log overhead is
probably twice as high, and if a transaction is large or long enough, the
rollback entries will physically hit the the disk before they are discarded
as well.
