Author Topic: Linux - projects backup or folder clone?  (Read 3634 times)


Online RoGeorgeTopic starter

  • Super Contributor
  • ***
  • Posts: 6779
  • Country: ro
Linux - projects backup or folder clone?
« on: December 09, 2020, 03:13:54 pm »
I have a big folder with many projects (about 100GB now for 90+ various HW/SW/learning/making/etc. projects).  This folder is very valuable to me, and it is now sitting on an 8TB ZFS pool, near other many folders that I don't care much.  I never used automated backups before, mostly to stay independent of 3rd party tools or file formats that might become discontinued in the future.

There are currently 3-4 locations (disks) involved:
  • 160 GB HDD - OS disk for a desktop with Ubuntu LTS, the day to day machine
  • 8 TB HDD - ZFS disk permanently attached to Ubuntu machine
  • 4 TB NAS - a dedicated RAID 5 machine of 4x1 TB HDDs that is only rarely powered on and available from LAN only.  It is a dedicated Western Digital single-board NAS, looking like a gray cube; I don't have root access.  The NAS can do Samba or NFS; it is currently configured for Samba for backward compatibility with a Windows laptop.
  • 120 GB HDD - Windows XP laptop, powered only a few times a year; I might need to copy some projects to it for offline/remote work.

The intention is to have Linux-only machines in the future; currently there are still some old Windows XP laptops on the same LAN, powered only a few times a year.

So far I have been manually copying the projects folder from the desktop to the NAS once every few months, then manually diffing the files/folders to catch accidental deletions (which have never actually happened yet).  However, comparing folders before copying is very time consuming, especially after reorganizing the folder structure of some projects in progress.

I almost never delete files, the only changes are in source files, but these are just a few and insignificant in size.  This folder usually grows by adding new projects and new files, new kits, new virtual machines to preserve IDE/toolchains, etc.

I want to preserve/synchronize a few folders in more than one place.
  • Should be a long-term solution, with minimal maintenance and no format conversions (some files are 10+ years old, from former DOS, Windows and Linux machines)
  • Should work standalone (like a file server) - no fancy imaging tools that no one will remember 10 years from now (no Norton Ghost or the like), no OS-dependent/closed solutions, etc.
  • Should copy/backup on request only - the NAS is usually unplugged
  • Must be incremental - the NAS is old and very slow (only 10 MB/s)
  • Must work entirely locally and offline (definitely no 3rd-party cloud storage)
  • Linux oriented, must be free and open source software

What to use, or how to do that?
 :-//
« Last Edit: December 09, 2020, 03:19:10 pm by RoGeorge »
 

Offline PKTKS

  • Super Contributor
  • ***
  • Posts: 1766
  • Country: br
Re: Linux - projects backup or folder clone?
« Reply #1 on: December 09, 2020, 03:24:33 pm »
Easiest of the options...

Set up an rsync server on your "SOURCE" machine.

Set up as many clients as you wish, one at each location
you want to sync.

Perform the sync once to populate the clients,
then keep them in sync periodically as needed
by running rsync manually or on a schedule.

hassle free
Paul
 

Offline Karel

  • Super Contributor
  • ***
  • Posts: 2267
  • Country: 00
Re: Linux - projects backup or folder clone?
« Reply #2 on: December 09, 2020, 03:36:24 pm »
Be careful with rsync: it's not an incremental backup. If you delete something and then sync, the deleted item is lost on the destination too.
You can run rsync with the option --dry-run and it will show you what would be copied and/or deleted, without actually copying or deleting anything.
« Last Edit: December 09, 2020, 03:39:53 pm by Karel »
 

Offline thinkfat

  • Supporter
  • ****
  • Posts: 2160
  • Country: de
  • This is just a hobby I spend too much time on.
    • Matthias' Hackerstübchen
Re: Linux - projects backup or folder clone?
« Reply #3 on: December 09, 2020, 04:53:13 pm »
I'm currently using a Nextcloud instance hosted on a local NAS, which is in turn online-replicated onto a second NAS, which again is regularly backed up onto an external disk drive.
The local client syncs selected folders between my laptop and the Nextcloud.
Everybody likes gadgets. Until they try to make them.
 

Offline Marco

  • Super Contributor
  • ***
  • Posts: 6971
  • Country: nl
Re: Linux - projects backup or folder clone?
« Reply #4 on: December 09, 2020, 05:41:18 pm »
How important is it that the incremental backups stay small? Rsnapshot/Back In Time/Timeshift will do incremental backups at the file level, the advantage being that a "snapshot" is just a standard directory tree with hard links for everything except the changed files. Duplicity/borg/restic will create much smaller incremental backups, but they require more effort and time to restore.

If you used a new OpenZFS NAS, you could use ZFS snapshots and send/receive.
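If both ends ran OpenZFS, the send/receive workflow would look roughly like this (the pool/dataset names like tank/projects are made up, and real pools plus root access are needed, so treat it as a sketch only):

```shell
# Hypothetical pool/dataset names -- a sketch, needs real ZFS pools.
zfs snapshot tank/projects@2020-12-09                    # cheap, atomic
zfs send tank/projects@2020-12-09 | ssh nas zfs receive backup/projects

# Later, transfer only the delta between two snapshots:
zfs snapshot tank/projects@2020-12-16
zfs send -i @2020-12-09 tank/projects@2020-12-16 | \
    ssh nas zfs receive backup/projects
```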
« Last Edit: December 09, 2020, 07:10:29 pm by Marco »
 

Online RoGeorgeTopic starter

  • Super Contributor
  • ***
  • Posts: 6779
  • Country: ro
Re: Linux - projects backup or folder clone?
« Reply #5 on: December 09, 2020, 08:33:31 pm »
Thank you all for the hints; it has all been very helpful so far.  I will comment only on what I don't like about each method, so please don't get mad at me for focusing on the bad parts:

1. rsync - I have very bad memories from the first time I tried it (I used the wrong combination of switches and ended up making the file server match my almost empty local folder  ;D  ).  All my fault, of course, but I have avoided it since.

If I understood correctly, rsync does not keep an incremental backup, so it cannot restore deleted files or former versions.  Is this true?


2. Nextcloud - took a look at it, and it's good that it's free and open source, but I never need a web interface to my files; I only ever open the projects from a mounted location on a local desktop.  I am trying to stay away from 3rd-party turnkey solutions, because they tend to lock the user into a certain environment and then get discontinued.


3. Rsnapshot/Back In Time/Timeshift - I will read about them, too.  It would be very nice if the backup stayed reasonably small.  For example, 1 TB occupied (versus the current 0.1 TB) for an append-only style of backup that lets me access files deleted last year would be great.

4. OpenZFS for the NAS with ZFS send/receive seems very tempting, yet the NAS is a "WD ShareSpace", a dedicated single-board computer, and most probably not an x86 architecture:  http://products.wdc.com/library/UM/ENG/4779-705006.pdf

I am not sure if it is possible to install OpenZFS on the WD ShareSpace hardware, and I suspect it doesn't have much RAM either.




Meanwhile I thought more about the requirements, and kept only the most important two:
  • - the current version of the files must be available like a normal folder.  No tools, databases or installations required to read them, just mounting (or mapping) the NAS.
  • - it must be able to restore any file from any time, including deleted or moved ones (like a "write only" incremental backup; I almost never delete, and only rarely edit small source files or reorganize/move folders around).  This way I can save the current version to the NAS without checking whether I accidentally deleted something since the last save.

Offline olkipukki

  • Frequent Contributor
  • **
  • Posts: 790
  • Country: 00
Re: Linux - projects backup or folder clone?
« Reply #6 on: December 09, 2020, 11:30:26 pm »

What to use, or how to do that?
 :-//

Classic tools like tar and your own scripting?!   ^-^
or...

Did you check rdiff-backup?

https://rdiff-backup.net/index.html
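For reference, basic rdiff-backup usage looks roughly like this (all paths are hypothetical examples; the latest state stays a plain directory tree, with reverse increments kept under rdiff-backup-data/ inside the mirror):

```shell
# Hypothetical paths -- adjust to the real projects folder and NAS mount.
rdiff-backup /home/user/projects /mnt/nas/projects-backup

# Restore a file as it was 10 days ago (-r / --restore-as-of):
rdiff-backup -r 10D /mnt/nas/projects-backup/some/file.txt /tmp/file.txt

# Prune increments older than one year:
rdiff-backup --remove-older-than 1Y /mnt/nas/projects-backup
```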



 

Offline retiredfeline

  • Frequent Contributor
  • **
  • Posts: 572
  • Country: au
Re: Linux - projects backup or folder clone?
« Reply #7 on: December 10, 2020, 12:23:19 am »
Another vote for rsnapshot. A caveat is that if you do large reorganisations like mass renames, moving directories, changing ownership, permissions and so forth, you may want to apply those changes to the latest snapshot, otherwise the next differential snapshot will be large.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4529
  • Country: nz
Re: Linux - projects backup or folder clone?
« Reply #8 on: December 10, 2020, 01:26:44 am »
Be careful with rsync: it's not an incremental backup. If you delete something and then sync, the deleted item is lost on the destination too.
You can run rsync with the option --dry-run and it will show you what would be copied and/or deleted, without actually copying or deleting anything.

Or, there's an option to not delete files at the far end that have disappeared from the local end.

You really really want to figure out (or get from someone else) the best set of options for your use, and put that in a script, not try to remember it.

Or use rdiff-backup, which also gives you dated snapshots you can restore from.

I was using rdiff-backup heavily 15 years ago, but now I've partitioned my data into small and/or volatile stuff that I "back up" using git, and big non-versioned media files, for which I use rsync without delete, trusting timestamps to avoid diffing the whole thing.

One problem with rdiff-backup is that it uses hard links to files that haven't changed to make the incremental backups, but as you can't make hard links to directories (on Linux file systems -- you can on HFS+) an entire tree that hasn't changed has to have every file inside individually linked every time. Apple added hard links to directories in their filesystem at the same time they released the first version of Time Machine.
 

Offline lordium

  • Regular Contributor
  • *
  • Posts: 62
  • Country: cn
Re: Linux - projects backup or folder clone?
« Reply #9 on: December 10, 2020, 01:53:13 am »
I use https://syncthing.net/ for this kind of thing. It does support different kinds of file versioning and patterns for ignoring unnecessary temporary files. I use a Linux server that is always up, a laptop and a desktop (both windows and Linux). They always automagically sync my folders between each other. It also has a nice GUI you can use. Haven't had any problems with this setup yet.
 

Offline sleemanj

  • Super Contributor
  • ***
  • Posts: 3047
  • Country: nz
  • Professional tightwad.
    • The electronics hobby components I sell.
Re: Linux - projects backup or folder clone?
« Reply #10 on: December 10, 2020, 02:01:20 am »
Borg
https://www.borgbackup.org/

Especially if there is any level of duplication in those projects, Borg will save you a crapton of space in your backup.  It's nice and easy to recover anything, as you can just mount an archive as a filesystem.
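A minimal Borg workflow sketch (the repository path and archive name here are hypothetical):

```shell
# Hypothetical repo path -- Borg deduplicates across all archives, so
# repeated kits/toolchains/VMs are stored only once.
borg init --encryption=none /mnt/nas/borg-repo           # one-time setup
borg create --stats /mnt/nas/borg-repo::projects-2020-12-10 ~/projects

# Browse any archive as a normal filesystem and copy files back out:
mkdir -p /tmp/borg-mnt
borg mount /mnt/nas/borg-repo::projects-2020-12-10 /tmp/borg-mnt
# ... recover what you need, then:
borg umount /tmp/borg-mnt
```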
~~~
EEVBlog Members - get yourself 10% discount off all my electronic components for sale just use the Buy Direct links and use Coupon Code "eevblog" during checkout.  Shipping from New Zealand, international orders welcome :-)
 

Offline Ed.Kloonk

  • Super Contributor
  • ***
  • Posts: 4000
  • Country: au
  • Cat video aficionado
Re: Linux - projects backup or folder clone?
« Reply #11 on: December 10, 2020, 03:57:58 am »
Thank you all for the hints; it has all been very helpful so far.  I will comment only on what I don't like about each method, so please don't get mad at me for focusing on the bad parts:

1. rsync - I have very bad memories from the first time I tried it (I used the wrong combination of switches and ended up making the file server match my almost empty local folder  ;D  ).  All my fault, of course, but I have avoided it since.

If I understood correctly, rsync does not keep an incremental backup, so it cannot restore deleted files or former versions.  Is this true?


Incremental backup is quite possible with rsync. With a little bit of scripting and patience, you can use the cp -al command to copy and hard-link the files as a snapshot. Bear in mind though that with hard-linking, if you change file metadata such as permissions, those changes show up in every snapshot that shares the linked file.
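The cp -al snapshot and the metadata caveat can be seen in a few lines (demo paths only):

```shell
#!/bin/sh
set -e
# Demo stand-ins for the live tree and the snapshot area.
DATA=$(mktemp -d)
SNAPS=$(mktemp -d)
echo v1 > "$DATA/file.txt"

# -a archive, -l hard-link regular files instead of copying their data:
cp -al "$DATA" "$SNAPS/snap-2020-12-10"

# Caveat from above: metadata lives on the shared inode, so a chmod on
# the live file is visible in the snapshot too.
chmod 600 "$DATA/file.txt"
```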

Whilst I prefer to roll up my sleeves, I think the plethora of backup/snapshot solutions available now is fantastic. It's never been easier to keep a Linux file system safe and secure.

iratus parum formica
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4529
  • Country: nz
Re: Linux - projects backup or folder clone?
« Reply #12 on: December 13, 2020, 01:32:36 pm »
I guess I must be the only one using Git for this. I have a gogs server running on the LAN, and the server is backed up locally using duplicity.

Not the only one.

I was using rdiff-backup heavily 15 years ago, but now I've partitioned my data into small and/or volatile stuff that I "back up" using git, and big non-versioned media files, for which I use rsync without delete, trusting timestamps to avoid diffing the whole thing.
 

Offline Doctorandus_P

  • Super Contributor
  • ***
  • Posts: 3890
  • Country: nl
Re: Linux - projects backup or folder clone?
« Reply #13 on: December 13, 2020, 04:21:25 pm »
The simplest would be to add an extra HDD to your PC and use it only for backups. You can put a relay in the power cable such that you turn the relay on with a push button, it keeps itself on, and it turns off when the PC turns off. (Or some fancier switch, maybe a uC on USB or I2C - every PC has I2C internally, also available on the PCI bus and VGA connectors.) If you can turn the HDD on via a script, that helps with automating backups.

HDDs can also be configured to stay off when they get powered, or be put to sleep after some time of inactivity, but personally I prefer a hardware switch.

This is another possibility, just add your own HDD.
https://www.hardkernel.com/shop/odroid-hc2-home-cloud-two/
 

Offline bobcat2000

  • Regular Contributor
  • *
  • Posts: 218
  • Country: us
Re: Linux - projects backup or folder clone?
« Reply #14 on: December 14, 2020, 12:06:02 am »
I used to sell these to my clients.  They seemed very happy with the service.  The appliance virtualizes your server hourly (or even minute by minute).  In the event your server fails, the backup appliance starts up the VM right away.  There is virtually no downtime.  The thing will start the backup VM daily and send you a screenshot to verify the VM can boot ok.  You can restore individual files from the VM from different days too.  The VM images are sent to the cloud for backup.

https://www.datto.com/
https://www.unitrends.com/
 

Offline olkipukki

  • Frequent Contributor
  • **
  • Posts: 790
  • Country: 00
Re: Linux - projects backup or folder clone?
« Reply #15 on: December 14, 2020, 11:07:31 am »
I guess I must be the only one using Git for this.

Are you using Git for binary content too?
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4529
  • Country: nz
Re: Linux - projects backup or folder clone?
« Reply #16 on: December 14, 2020, 11:52:16 am »
I guess I must be the only one using Git for this.

Are you using Git for binary content too?

Git treats all content as binary.

Its deltas between files are based on byte ranges, not on something like "lines".
 

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 28058
  • Country: nl
    • NCT Developments
Re: Linux - projects backup or folder clone?
« Reply #17 on: December 15, 2020, 12:02:52 am »
I guess I must be the only one using Git for this. I have a gogs server running on the LAN, and the server is backed up locally using duplicity.
I use GIT as well for binary files (like FPGA images that go together with a certain version of the software). If versioning is important, GIT also lets you at least make sense of which version is what, instead of providing only a change date.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline olkipukki

  • Frequent Contributor
  • **
  • Posts: 790
  • Country: 00
Re: Linux - projects backup or folder clone?
« Reply #18 on: December 15, 2020, 08:55:53 am »
Why not?
How big is your repo, and what is the largest binary file there?

Are you suggesting that the OP put an "about 100GB now" folder under git supervision... or should the OP think hard about restructuring "90+ various HW/SW/learning/making/etc. projects" into 90+ repos?
 

Offline olkipukki

  • Frequent Contributor
  • **
  • Posts: 790
  • Country: 00
Re: Linux - projects backup or folder clone?
« Reply #19 on: December 15, 2020, 09:06:15 am »
Git treats all content as binary.
Its deltas between files are based on byte ranges, not on something like "lines".
Sounds great, but how will it help with versioning things such as 100MB+ digital media, 1GB+ 3D models or 50GB+ VM images (of course, everything in native binary formats)?

Why else would they come up with git binary on steroids (aka git-lfs) to somehow address that?
 

Offline olkipukki

  • Frequent Contributor
  • **
  • Posts: 790
  • Country: 00
Re: Linux - projects backup or folder clone?
« Reply #20 on: December 15, 2020, 09:42:04 am »
I use GIT as well for binary files (like FPGA images that go together with a certain version of the software). If versioning is important, GIT also lets you at least make sense of which version is what, instead of providing only a change date.
It's not a crime to bend git into an artifact repository, but it never was and never will be meant for that  :P
 

Offline bd139

  • Super Contributor
  • ***
  • Posts: 23096
  • Country: gb
Re: Linux - projects backup or folder clone?
« Reply #21 on: December 15, 2020, 01:30:20 pm »
OP's situation is fine.

Use rdiff-backup. It's the only thing I have found that actually handles large inode counts and large files reliably. I have a 14TiB volume with several tens of millions of files, ranging from a few bytes to a few hundred MB. It keeps the last snapshot as a completely readable directory tree, with backwards increments in time. It's similar to Time Machine on Macs (which I use there).

https://rdiff-backup.net

Things I tried:

1. rsync on its own. No incrementals, serious deletion-handling problems  :palm:. Fine for mirroring only!
2. git. Absolutely useless for large file counts and large files. O(wtf) scalability problems everywhere  :palm:
3. duplicity. Actually ran out of RAM on a machine with 256 GB of RAM  :palm:. I would never touch this again - it's a complete honk of shit.

Edit: various block level backups and network solutions like borg - forget them for simple volume mirroring!

My own personal stuff is all on macOS with Time Machine as a backup. My Linux VMs are all code-only so everything is pushed to github and then pulled back down to the mac.
« Last Edit: December 15, 2020, 01:36:29 pm by bd139 »
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4529
  • Country: nz
Re: Linux - projects backup or folder clone?
« Reply #22 on: December 15, 2020, 09:02:49 pm »
Git treats all content as binary.
Its deltas between files are based on byte ranges, not on something like "lines".
Sounds great, but how it will help to versioning such as 100MB+ digital media, 1GB+ 3D models or 50GB+ VM images (of course, eveything in native binary formats)?

"digital media" and "3d models" are rather vague terms. What kind of files you have *matters*. Are they diffable in that what the user regards as small changes result in small deltas?

VM images and databases are just fine with git. No matter what their size, if the user edits a few files or modifies a few database records then git will add the new version to its repo using a small delta.

I don't know of any practical file size limitation with git. I just tried creating a repo with a few VM images I have lying around. On a Mac Mini it took about a minute per 5 GB of VM image to initially add it to the repo. But the repo came out 1/8th the size of the VM image. I didn't see git use more than 1.5 GB of RAM in the process (and in general I haven't seen git use large amounts of RAM when processing large files).

The only limitation is that you need to have the git repo accessible on a mounted filesystem, and git assumes that's fast local storage, and the full history of all versions is stored in that local repo, so it eventually becomes big.

But, as long as the things you are adding are diffable, the size of the repo is much much smaller than the total size of all the versions you add to it.
 

