CLI Magic: Simple backup is Mirdir (Linux.com)

Posted Nov 22, 2005 9:30 UTC (Tue) by janpla (guest, #11093)
Parent article: CLI Magic: Simple backup is Mirdir (Linux.com)

What I wonder is - how is it different from, say, (cd /from-dir;tar cf * -)|(cd /to-dir; tar xf -)?
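
For reference, here is a minimal sketch of the tar-pipe idiom in its usual form. The `f` option takes the next argument as the archive name, so `-` has to follow `cf` directly, and archiving `.` rather than `*` also picks up dotfiles. `copy_tree` is a hypothetical helper name, and both directories are assumed to exist already.

```shell
# Copy the contents of one directory tree into another through a tar pipe.
# copy_tree is a hypothetical helper; both directories must already exist.
copy_tree() {
    src=$1
    dst=$2
    # "cf -" writes the archive to stdout; "." includes dotfiles, unlike "*".
    # "p" on extraction keeps the permissions stored in the archive.
    (cd "$src" && tar cf - .) | (cd "$dst" && tar xpf -)
}
```

It would be called as `copy_tree /from-dir /to-dir`.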


Posted Nov 22, 2005 10:25 UTC (Tue) by hppnq (guest, #14462)

Your example is a quick hack that has nothing to do with synchronization/mirroring and is definitely not extremely suitable for newbies (or anyone else, for that matter). For crying out loud, at least read the article that you are commenting on.

Comparison with TSM anyone?

Posted Nov 22, 2005 21:25 UTC (Tue) by tjw.org (guest, #20716)

> Your example is a quick hack that has nothing to do with synchronization/mirroring and is definitely not extremely suitable for newbies (or anyone else, for that matter).

That is completely untrue. This command has everything to do with mirroring. This method has been around a very long time and is commonly used and documented. Often it's attributed to Alan Cox in Linux-related HOWTOs.

In fact, before "-a" became an option for GNU cp, this was THE way to archive a directory. Perhaps it still is on non-GNU systems that lack the cp archive option (e.g. Mac OS X).

I still use a variation of this command from time to time.

For example, if I wanted to mirror a directory containing millions of files totalling over a TiB of disk space, I wouldn't want to use the following command for the initial copy:

rsync -azP somehost:/some/dir/ /some/dir/

This is because this command could take days to complete and could make the machine swap to death as it deals with the mind-bogglingly huge checksumming. Instead, I would use a very simplistic command that uses the smallest amount of CPU and RAM possible:

ssh somehost "cd /some/dir && tar czf - ." | tar zxvf -

Posted Nov 22, 2005 23:26 UTC (Tue) by hppnq (guest, #14462)

> That is completely untrue. This command has everything to do with mirroring.

Nope, it has to do with archiving or moving data around, like you seem to observe yourself a few lines further on. Look up a definition of mirroring or hey, read the article! And yes, cp -a or -rp is the way to go of course. Consult the tar manpage for a hint of why this is a quick hack not suitable for mirroring, the synopsis should do it.

> ssh somehost "cd /some/dir && tar czf - ." | tar zxvf -

That makes more sense, yes. How exactly does it relate to mirdir?

Posted Nov 23, 2005 16:47 UTC (Wed) by tjw.org (guest, #20716)

> Consult the tar manpage for a hint of why this is a quick hack not suitable for mirroring, the synopsis should do it.

Sorry, you'll have to be more specific. I don't see any reason why this is bad practice in the tar man or info pages.

> Nope, it has to do with archiving or moving data around, like you seem to observe yourself a few lines further on. Look up a definition of mirroring or hey, read the article!

I did read the article.

I define mirroring a directory as "making an exact copy of the directory". A mirror copy, if you will. To me, this means that all files, permissions, ownership, timestamps, and special files like devices should be exactly the same in both copies.

> That makes more sense, yes. How exactly does it relate to mirdir?

It relates to mirdir only because it's a different method of achieving the same end. While mirdir/mirrordir/rsync do some checking to eliminate unnecessary copying, cp or tar can be used to copy everything every time. My point was that there are cases when the latter method is preferable.

Posted Nov 23, 2005 21:29 UTC (Wed) by hppnq (guest, #14462)

> Sorry, you'll have to be more specific.

Metadata. Portability. See your definition of mirroring:

> To me, this means that all files, permissions, ownership, timestamps, and special files like devices should be exactly the same in both copies.

You forget one quite important aspect: synchronization. This is what sets a mirror apart from an archive or a snapshot, for instance. Of course the concepts overlap: you can make an archive out of a mirror and the other way around.

> It relates to mirdir only because it's a different method of achieving the same end.

Your tar | ssh | tar example served to illustrate a powerful variation of the original tar | tar example. But it has nothing to do with mirdir, which is entirely different, and which the article explains in a way that even a newbie can understand.

Anyway, we're boring everybody's pants off, mine are in the laundry already. The point I wanted to make is: Joe wrote an article intended for newbies about making a simple local backup using a simple command -- and you take offence, because rsync appears to be a superior solution:

> I seriously question the experience level of the person who writes the CLI Magic series.

I seriously question your reading abilities and your judgment. Like I said: I think articles like these are quite nice, because they help unleash some of the power that Linux offers to inexperienced users. And obviously you don't know Joe Barr.

Posted Nov 24, 2005 10:28 UTC (Thu) by mp (subscriber, #5615)

> For example, if I wanted to mirror a directory containing millions of files totalling over a TiB of disk space, I wouldn't want to use the following command for the initial copy:
>
> rsync -azP somehost:/some/dir/ /some/dir/
>
> This is because this command could take days to complete and could make the machine swap to death as it deals with the mind-bogglingly huge checksumming.

True, though the --whole-file option of rsync would help with the checksumming problem.

Posted Nov 27, 2005 3:03 UTC (Sun) by zblaxell (subscriber, #26385)

The problem isn't checksumming. By default, rsync does no checksumming until it encounters a file that exists on both source and destination with a different size or timestamp; only '-c' makes it checksum every file.
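
The default decision described above can be sketched roughly as follows. This only illustrates the size-and-timestamp comparison and is not rsync's actual code; `needs_transfer` is a hypothetical name, and `stat -c` assumes GNU coreutils.

```shell
# Return 0 (true) if dst is missing or differs from src in size or mtime.
# needs_transfer is a hypothetical helper; stat -c assumes GNU coreutils.
needs_transfer() {
    src=$1
    dst=$2
    # A file that does not exist on the destination always has to be sent.
    [ -e "$dst" ] || return 0
    # %s = size in bytes, %Y = modification time in seconds since the epoch
    [ "$(stat -c '%s %Y' "$src")" != "$(stat -c '%s %Y' "$dst")" ]
}
```

Only files for which such a check comes back true would go on to any further comparison; the rest are skipped on metadata alone.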

The problem is that rsync *requires* a full list of all files be generated and sent to the client, with both client and server having a copy of this list stored in RAM, before *any* data is transferred. A 32-bit processor has enough address bits for a few million files. After that, you need better software, or more address bits.
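
As a back-of-the-envelope sketch of why the file list becomes the limit: memory grows linearly with the number of entries. The per-entry cost passed in below is an assumed placeholder, not a measured rsync figure, and `file_list_mib` is a hypothetical helper name.

```shell
# Estimate file-list memory in MiB as entries * bytes per entry.
# bytes_per_entry is an assumed placeholder, not a measured rsync figure.
file_list_mib() {
    files=$1
    bytes_per_entry=$2
    # Integer arithmetic; the result is rounded down to whole MiB.
    echo $(( files * bytes_per_entry / 1024 / 1024 ))
}
```

For instance, `file_list_mib 5000000 100` estimates the list for five million files at an assumed 100 bytes each, and both ends of the connection would need to hold it.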

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds