virtualization
Posted Mar 27, 2014 18:58 UTC (Thu)
by kleptog (subscriber, #1183)
In reply to: virtualization by marcH
Parent article: PostgreSQL pain points
> Well, you can't use that on a live database anyway, so this point looks moot. Unless maybe you rely on a filesystem with snapshotting which is... not far from duplicating a database feature! Same pattern again.
You can. It's useful for both backup and replication. Basically you can use rsync to quickly update your backup image. And then you take a copy of the WAL logs. The combination gives you a backup. If you have a snapshotting filesystem you can indeed achieve similar effects.
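To make the rsync-plus-WAL idea concrete, here is a rough sketch of the command sequence, not a vetted backup script. The paths and the `rsync_backup_plan` helper are made up for illustration. The `pg_start_backup`/`pg_stop_backup` function names and the `pg_xlog` directory are assumptions matching PostgreSQL releases of this era; later releases renamed them. The WAL segments written while the copy runs still have to be saved separately (for example through WAL archiving), which is not shown.

```python
import shlex


def rsync_backup_plan(data_dir, backup_dir, label="nightly"):
    """Return, in order, the commands for an rsync-based base backup.

    Assumptions (not from the thread): pg_start_backup/pg_stop_backup and
    the pg_xlog directory name apply to the server version in use.
    """
    # Trailing slash makes rsync copy the directory contents, not the directory.
    src = data_dir.rstrip("/") + "/"
    start = ["psql", "-c", f"SELECT pg_start_backup('{label}')"]
    copy = [
        "rsync", "-a", "--delete",
        "--exclude=pg_xlog/*",       # WAL is collected separately
        "--exclude=postmaster.pid",  # belongs to the running server only
        src, backup_dir,
    ]
    stop = ["psql", "-c", "SELECT pg_stop_backup()"]
    return [start, copy, stop]


if __name__ == "__main__":
    # Hypothetical paths; only prints the commands, runs nothing.
    for cmd in rsync_backup_plan("/var/lib/postgresql/data", "/backup/pgdata"):
        print(shlex.join(cmd))
```

Because rsync only transfers what changed since the previous run, repeating the same plan against an existing backup image is what makes the update quick.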
> > * Can't keep up with hardware advances in a timely fashion.
> > * Clobbering all other IO-using software on the same machine.
> Sorry I don't get these two. Care to elaborate?
For the first, consider the effects the rise of SSD is having on the Linux VFS. That would need to be replicated in the database. For the second, as a userspace program you don't have a good view of what the rest of the system is doing, hence you might be interfering with other processes. The kernel has the overview.
It's a feature that a database doesn't assume it's the only program on a machine.
Posted Mar 27, 2014 19:32 UTC (Thu)
by marcH (subscriber, #57642)
> For the second, as a userspace program you don't have a good view of what the rest of the system is doing, hence you might be interfering with other processes. The kernel has the overview.
How is the raw partition approach worse here? I would intuitively think it makes things better: less sharing.
Anyway: any database of serious size runs on dedicated or practically dedicated hardware, doesn't it?
Posted Mar 28, 2014 22:24 UTC (Fri)
by kleptog (subscriber, #1183)
> > For the second, as a userspace program you don't have a good view of what the rest of the system is doing, hence you might be interfering with other processes. The kernel has the overview.
> How is the raw partition approach worse here? I would intuitively think it makes things better: less sharing.
I think it depends on what your goals are. If your goal is to make the absolutely fastest database server possible, then you'd probably want to use raw access on a system with nothing else running.
If your goal is to make a database server that is broadly useful and runs efficiently on a wide variety of systems, then asking the kernel to do its job is the better idea.
PostgreSQL leans toward the latter. The gains you can get from raw access are simply not worth the effort and would make PostgreSQL much harder to deploy. A database server that only works well when it's got the machine to itself is a PITA in many situations.
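As a small illustration of what "asking the kernel to do its job" can look like from userspace, here is a sketch that writes through the page cache, uses `fsync` as the durability point, and passes an access-pattern hint instead of scheduling the I/O itself. This is not PostgreSQL's code; the helper names `write_durably` and `read_sequential` are invented, and `os.posix_fadvise` is assumed to be available only on some platforms, hence the guard.

```python
import os
import tempfile


def write_durably(path, data):
    """Write through the kernel page cache, then ask the kernel to flush."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:  # os.write may write fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # durability point; the kernel schedules the actual I/O
    finally:
        os.close(fd)


def read_sequential(path):
    """Read a whole file after hinting that access will be sequential."""
    fd = os.open(path, os.O_RDONLY)
    try:
        if hasattr(os, "posix_fadvise"):  # not present on every platform
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "segment")  # hypothetical file name
        write_durably(target, b"x" * 8192)
        print(len(read_sequential(target)))
```

The program only states what it needs (durability, sequential access); how that competes with other processes' I/O is left to the kernel, which is the one with the overview.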
Posted Mar 29, 2014 3:37 UTC (Sat)
by fandingo (guest, #67019)
I'm not sure a database should be implementing operations necessary for ATA TRIM.
Posted Mar 27, 2014 19:36 UTC (Thu)
by marcH (subscriber, #57642)
Heh, that was missing.
I am still not convinced that rsync is the ultimate database backup tool. As much as I love rsync it surely does not have the patented exclusivity of incremental copying/backup techniques.
Posted Apr 14, 2014 7:41 UTC (Mon)
by MortenSickel (subscriber, #3238)
On the other hand, for any database of a certain size and importance, you probably want a separate partition for the database files, so it could be possible to advise using a particular file system with particular parameters to get optimal performance.
So, no raw partitions, please - unless rsync and other file management tools get patched to read them... :-P