The original post: /r/datahoarder by /u/dlangille on 2024-12-31 17:40:54.

I’m making more of my private repos available on GitHub. With GitHub now being the primary source, I need to back it up. Fortunately, I found an easy [for me] solution: I was already using Gitea as my git repo at home, so I created a ‘pull mirror’ for each repo I want to back up.

https://docs.gitea.com/next/usage/repo-mirror#pulling-from-a-remote-repository

That creates a copy in my local gitea instance.
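If you have many repos, the mirrors can also be created through Gitea's API instead of the web UI, via POST /api/v1/repos/migrate with mirror set to true. A sketch only — the instance URL, token variable, and interval value below are placeholders for your own setup:

```shell
# Hypothetical values: substitute your own Gitea URL, API token, and repo name.
GITEA_URL="https://gitea.example.org"
REPO="freshports"

# Build the migrate request body; "mirror": true makes it a pull mirror
# that Gitea re-syncs on the given interval.
payload=$(printf '{"clone_addr":"https://github.com/dlangille/%s.git","repo_name":"%s","mirror":true,"mirror_interval":"8h0m0s"}' "$REPO" "$REPO")
echo "$payload"

# The actual call would be something like:
# curl -X POST "$GITEA_URL/api/v1/repos/migrate" \
#   -H "Authorization: token $TOKEN" \
#   -H "Content-Type: application/json" \
#   -d "$payload"
```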

To go one step further, because this is about backups after all, I did a git pull of each of those repos onto another host:

[17:32 mydev dvl ~/GitHub-backups/FreshPorts] % ls
accounts/     docs/                     helper_scripts/    periodics/
check_repos/  freshports/               host-init/         vuxml/
daemontools/  freshports-www-offline/   nginx-config/
databases/    git_proc_commit/          packages-import/
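The initial copies come from a one-time clone. A dry-run sketch that only prints the commands (the FreshPorts org name and the shortened repo list are my assumptions — substitute your own):

```shell
# Print the clone commands rather than running them; drop the echo to clone.
BACKUP_DIR="$HOME/GitHub-backups/FreshPorts"
cmds=$(for repo in accounts check_repos daemontools databases docs
do
  # hypothetical origin URLs -- assumes the repos live under a FreshPorts org
  echo "git clone -q git@github.com:FreshPorts/$repo.git $BACKUP_DIR/$repo"
done)
echo "$cmds"
```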

I created a new passphrase-less ssh key pair for use only as a read-only deploy key on those repos. That allows me to use this script to refresh the local working copies on a regular basis:
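Generating such a key might look like this (shown in a temp directory for the demo; in practice the private key would live at ~/.ssh/read-only-key, and the key comment is arbitrary):

```shell
# Create a passphrase-less ed25519 key pair; -N "" sets an empty passphrase.
# Demo uses a temp dir -- in practice write to ~/.ssh/read-only-key.
KEYDIR=$(mktemp -d)
KEYFILE="$KEYDIR/read-only-key"
ssh-keygen -q -t ed25519 -N "" -C "github-backups read-only" -f "$KEYFILE"

# The .pub half is what gets added as a read-only deploy key on GitHub.
cat "$KEYFILE.pub"
```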

% cat ~/bin/refresh-GitHub-backups.sh

#!/bin/sh

REPO_DIR="/usr/home/dvl/GitHub-backups"

# FreeBSD find(1): -depth 2 matches directories exactly two levels down
repos=$(find "$REPO_DIR" -type d -depth 2)

for repo in $repos
do
  cd "$repo" || exit 1
  GIT_SSH_COMMAND='ssh -i ~/.ssh/read-only-key -o IdentitiesOnly=yes' git pull -q

  if [ $? -ne 0 ]; then
    echo "problem in $repo"
    exit 1
  fi
done
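"On a regular basis" here could be a cron job. A crontab entry to run it nightly might look like this (the 03:00 hour is arbitrary):

```
0 3 * * * /usr/home/dvl/bin/refresh-GitHub-backups.sh
```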

All of this is stored on ZFS filesystems, with regular snapshots provided by sanoid. Backups of this directory are stored on another host.
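For reference, a sanoid.conf fragment covering such a dataset might look like the following — the dataset name and retention counts are made up, so adjust both for your pool:

```
# hypothetical dataset -- substitute the one holding the backups
[zroot/home/dvl/GitHub-backups]
        use_template = backups

[template_backups]
        frequently = 0
        hourly = 36
        daily = 30
        monthly = 3
        autosnap = yes
        autoprune = yes
```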

EDIT: grammar