
Backup/restore solutions for storage

CloudNativePG (Postgres)

Backups are natively supported by CloudNativePG through Barman (https://pgbarman.org/), which produces base backup files and WAL archives. A base backup captures the entire cluster, while the WAL archives record the changes applied to the cluster afterwards. Backups are taken regularly on a cron schedule. To restore a cluster, a full backup is first restored to bring the cluster back to its exact state at the time of the backup; the WAL (Write-Ahead Log) is then replayed to recover the transactions applied after that backup. This allows point-in-time recovery (PITR) to a specific moment, for example to stop the replay just before an erroneous operation.
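Once the object store described below is configured, a backup can also be triggered on demand through the operator's Backup resource. A minimal sketch, assuming the CloudNativePG CRDs are installed and reusing the CLUSTER_NAME and PG-FAT_NAMESPACE placeholders used later in this page:

apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: on-demand-backup # hypothetical name
  namespace: PG-FAT_NAMESPACE
spec:
  cluster:
    name: CLUSTER_NAME

Applying this manifest with kubectl apply -f triggers a single backup to the configured object store, in addition to the scheduled ones.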

Create a MinIO bucket to store backups and WAL archives
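The bucket can be created from the MinIO console (same procedure as in the Clickhouse section below), or from the command line with the MinIO client. A minimal sketch, assuming mc is available and reusing the S3 placeholders from the Helm values below; the alias name is only an example:

mc alias set artemis-s3 http://s3.pf-dna--s300.svc.cluster.local:9000 S3_USERNAME S3_PASSWORD # "artemis-s3" is an arbitrary local alias
mc mb artemis-s3/BUCKET_NAME # create the bucket that will receive backups and WAL archives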

Configure scheduled backups in the Helm values

  • Add this part to the Helm values file (if it is not already present):
backups:
  enabled: true

  endpointURL: "http://s3.pf-dna--s300.svc.cluster.local:9000" # Adapt to the S3 service and namespace: "http://SERVICE_NAME.NAMESPACE.svc.cluster.local:SERVICE_PORT"
  # -- Specifies a CA bundle to validate a privately signed certificate.
  endpointCA:
    # -- Creates a secret with the given value if true, otherwise uses an existing secret.
    create: false
    name: ""
    key: ""
    value: ""

  destinationPath: ""
  provider: s3
  s3:
    bucket: "BUCKET_NAME" # The name of the bucket previously created
    path: "/PATH_NAME" # For example "/v1", or "/v2" if the cluster has already been recovered once, etc.
    accessKey: "S3_USERNAME"
    secretKey: "S3_PASSWORD"
  secret:
    # -- Whether to create a secret for the backup credentials
    create: true
    # -- Name of the backup credentials secret
    name: ""

  wal:
    # -- WAL compression method. One of `` (for no compression), `gzip`, `bzip2` or `snappy`.
    compression: gzip
    # -- Whether to instruct the storage provider to encrypt WAL files. One of `` (use the storage container default), `AES256` or `aws:kms`.
    encryption: "" # For now, encryption does not work
    # -- Number of WAL files to be archived or restored in parallel.
    maxParallel: 1
  data:
    # -- Data compression method. One of `` (for no compression), `gzip`, `bzip2` or `snappy`.
    compression: gzip
    # -- Whether to instruct the storage provider to encrypt data files. One of `` (use the storage container default), `AES256` or `aws:kms`.
    encryption: "" # For now, encryption does not work
    # -- Number of data files to be archived or restored in parallel.
    jobs: 2

  scheduledBackups:
    # A list of scheduled backups. Here is an example of a daily backup scheduled at midnight; a backup is also made at cluster creation by default.
    -
      # -- Scheduled backup name
      name: daily-backup
      # -- Schedule in cron format
      schedule: "0 0 0 * * *"
      # -- Backup owner reference
      backupOwnerReference: self
      # -- Backup method, can be `barmanObjectStore` (default) or `volumeSnapshot`
      method: barmanObjectStore
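
Once deployed, the scheduled backup and the backup objects it creates can be checked with kubectl, assuming the CloudNativePG CRDs are installed:

kubectl get scheduledbackups.postgresql.cnpg.io,backups.postgresql.cnpg.io -n PG-FAT_NAMESPACE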

Restore a cluster

  • Get the Helm values file of the cluster you want to restore.
  • Modify the value "mode" from "standalone" to "recovery".
  • Add this "recovery" part to the cluster Helm values file:
recovery:
  method: object_store

  ## -- Point in time recovery target. Specify one of the following:
  pitrTarget:
    # -- Time in RFC3339 format
    time: ""

  ##
  # -- Backup Recovery Method
  backupName: "" # Optional: name of the backup to recover from, defaults to the latest.

  ##
  # -- The original cluster name when used in backups. Also known as serverName.
  clusterName: "CLUSTER_NAME"
  endpointURL: "http://s3.pf-dna--s300.svc.cluster.local:9000" # Adapt to the S3 service and namespace: "http://SERVICE_NAME.NAMESPACE.svc.cluster.local:SERVICE_PORT"
  # -- Specifies a CA bundle to validate a privately signed certificate.
  endpointCA:
    # -- Creates a secret with the given value if true, otherwise uses an existing secret.
    create: false
    name: ""
    key: ""
    value: ""
  destinationPath: ""
  provider: s3
  # The same configuration you used in the backups section of the cluster you want to restore
  s3:
    bucket: "BUCKET_NAME" # The name of the bucket previously created
    path: "/PATH_NAME" # For example "/v1", or "/v2" if the cluster has already been recovered once, etc.
    accessKey: "S3_USERNAME"
    secretKey: "S3_PASSWORD"
  secret:
    # -- Whether to create a secret for the backup credentials
    create: true
    # -- Name of the backup credentials secret
    name: ""
  • Modify the backups part so that the restored cluster stores its backups in another folder. At least the "path" value must be changed: for example, if the path was "/v1", now use "/v2". The operator includes a safety check to ensure a cluster will not overwrite a storage location that already contains data; a cluster that would overwrite existing storage remains in the state "Setting up primary" with its Pods in an Error state. You can also change the entire backups configuration to write backups and WAL archives to another bucket.
  • Run:
helm upgrade --install HELM_RELEASE_NAME -f HELM_VALUES_FILE --namespace PG-FAT_NAMESPACE --create-namespace https://nexus.technique.artemis/repository/Artemis-Helm/cluster-0.1.0.tgz
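
After the upgrade, the recovery can be followed by watching the cluster and its Pods (same placeholders as above):

kubectl get clusters.postgresql.cnpg.io -n PG-FAT_NAMESPACE
kubectl get pods -n PG-FAT_NAMESPACE -w

If the cluster stays in "Setting up primary" with Pods in error, check that the backups path was changed as described above.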

MinIO (S3)

Clickhouse

Backups are made with the clickhouse-backup tool (https://github.com/Altinity/clickhouse-backup). A container is deployed in each Clickhouse replica and exposes a REST API on port 7171, through which backups can be created and restored.

Create a MinIO bucket to store backups

  • Go to the MinIO console (look at the S3 ingress on Kubernetes to get the URL; for example, for Kosmos-dev: https://s3.kosmos-dev.athea/).
  • Create one bucket per cluster (metier and technique), give each the name you want (METIER_BUCKET_NAME and TECHNIQUE_BUCKET_NAME below), and click on "Create Bucket".

Configure backup parameters

Using the Helmfile project:

For each Clickhouse cluster (technique or metier), configure the S3 storage and backup options in the environment file (for example environments/kosmos-dev) like this:

clickhouse:
  technique:
    ...
    backup:
      enabled: true
      allow_empty_backups: false
      full_interval: "24h"
      watch_interval: "1h"
      backups_to_keep_local: "-1"
      backups_to_keep_remote: "0"
      s3:
        bucket: "TECHNIQUE_BUCKET_NAME"
        path: "PATH"
  metier:
    ...
    backup:
      enabled: true
      allow_empty_backups: false
      full_interval: "24h"
      watch_interval: "1h"
      backups_to_keep_local: "-1"
      backups_to_keep_remote: "0"
      s3:
        bucket: "METIER_BUCKET_NAME"
        path: "PATH"
  • allow_empty_backups (default is false): allow empty backups; if false and nothing has changed between two backups, no backup is created.
  • backups_to_keep_local (default is -1): how many of the latest local backups to keep; 0 means all created backups are kept on local disk, -1 means a backup is kept after create but deleted after the create_remote command.
  • backups_to_keep_remote (default is 0): how many of the latest backups to keep on remote storage; 0 means all uploaded backups are kept on remote storage.
  • watch_interval (default is 1h): used only by the watch command; an incremental backup is created every interval (for example WATCH_INTERVAL=1h).
  • full_interval (default is 24h): used only by the watch command; a full backup is created every interval (for example FULL_INTERVAL=24h for a daily full backup).
  • watch_is_main_process (default is false): treats the watch command as the main API process; if it stops unexpectedly, the API server is also stopped. The API server is not stopped if the watch command is cancelled by the user.

All the available parameters are listed here: https://github.com/Altinity/clickhouse-backup/blob/master/ReadMe.md#configurable-parameters. Some of them are not exposed in the Helm files. To add a parameter to the backup container, add the corresponding environment variables to the container named clickhouse-backup in the file apps/clickhouse/clickhouse/templates/clickhouse-installation.yaml (l62).
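
clickhouse-backup reads its configuration from upper-case environment variables that mirror the parameter names. A minimal sketch of such an addition to the clickhouse-backup container spec (the variable and value are only an illustration):

# Excerpt of the containers list in clickhouse-installation.yaml; only the env entry is new
- name: clickhouse-backup
  ...
  env:
    # Hypothetical example: keep only the 7 most recent backups on remote storage
    - name: BACKUPS_TO_KEEP_REMOTE
      value: "7"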

Backup a cluster using API calls

To make calls to the API, first create a port-forward (local port first, then the API port 7171):

kubectl port-forward CLICKHOUSE_POD_NAME -n CLICKHOUSE_NAMESPACE YOUR_LOCAL_PORT:7171 &

Then you can make API calls. There are two ways to create a backup; one of them is sketched below.
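
As an illustration, assuming the standard clickhouse-backup REST endpoints, a local backup can be created and then uploaded to the configured S3 bucket (the backup name is only an example):

    curl -s "localhost:YOUR_LOCAL_PORT/backup/create?name=manual-backup-1" -X POST | jq .
    curl -s localhost:YOUR_LOCAL_PORT/backup/upload/manual-backup-1 -X POST | jq .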

Restore a cluster using API calls

  • List available backups:

    curl -s localhost:YOUR_LOCAL_PORT/backup/list | jq .
  • Get the name of the backup you want to use.

  • If your backup is remote, download it:

    curl -s localhost:YOUR_LOCAL_PORT/backup/download/BACKUP_NAME -X POST | jq .
  • Restore it:

    curl -s localhost:YOUR_LOCAL_PORT/backup/restore/BACKUP_NAME -X POST | jq .
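
Download and restore run asynchronously; their progress can be checked with the status endpoint of the same API:

    curl -s localhost:YOUR_LOCAL_PORT/backup/status | jq .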

Opensearch

TiDB (VStore)