Design of tiCrypt backup

Alin Dobra

Core principles

The design is built on the following notions:

  • full backup/checkpoint: a complete backup of the resources. Recovering a resource requires only the information in such a backup.
  • incremental backup: a "delta" relative to the previous incremental backup or checkpoint. Recovering the resources covered by the backup requires the checkpoint plus all the incremental backups between that checkpoint and the recovery point.
  • backup strategy: specifies which types of resources to back up and how often.
  • backup domain: specifies the resources to back up together with the backup strategy to use. All the resources covered by a backup domain are backed up together.
  • backup: an incremental or full backup of a specific backup domain at a specific point in time.
  • external backup server: a server with SFTP connectivity that holds the backup files externally.
  • backup solution: software running on the external backup server that actually backs up the files to tape or cloud storage.

Except for SFTP interface availability, there is no requirement on the external backup server. In particular, any operating system and backup solution can be used.
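
To make the relationship between checkpoints and incremental backups concrete, here is a minimal sketch that selects the backups needed to recover a domain as of a given time. The `BackupRecord` shape and the `backupsNeeded` function are hypothetical names used for illustration; they are not part of tiCrypt.

```typescript
// Hypothetical record describing one backup of a backup domain.
interface BackupRecord {
  id: string;
  kind: "full" | "incremental";
  takenAt: number; // seconds since the epoch
}

// Return the backups needed to recover the state as of `recoveryPoint`:
// the most recent checkpoint at or before that time, followed by every
// incremental backup taken after it, up to the recovery point.
function backupsNeeded(all: BackupRecord[], recoveryPoint: number): BackupRecord[] {
  const eligible = all
    .filter((b) => b.takenAt <= recoveryPoint)
    .sort((a, b) => a.takenAt - b.takenAt);

  // Walk backwards to find the last full backup among the eligible ones.
  let checkpointIndex = -1;
  for (let i = eligible.length - 1; i >= 0; i--) {
    if (eligible[i].kind === "full") {
      checkpointIndex = i;
      break;
    }
  }
  if (checkpointIndex < 0) {
    throw new Error("no checkpoint exists at or before the recovery point");
  }
  return eligible.slice(checkpointIndex);
}
```

Under these assumptions, frequent checkpoints shorten the chain that has to be retrieved for a recovery, at the price of larger backups.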

Backup strategies

  • name: displayable name for the backup strategy
  • id: internal ID
  • incrementalInterval: the interval between incremental backups in seconds.
  • checkpointInterval: the interval between full backups/checkpoints in seconds.
  • teamStrategy: the strategy for the teams
  • projectStrategy: the strategy for the projects
  • userStrategy: the strategy for the users

Each of the three strategies is a map from names to booleans (true/false). The scopes in which a name applies are given in brackets. The possible names and their meanings are:

  • userVault [team,project]: should we back up the vaults of the users who are part of the team/project?
  • userForms [team,project]: should we back up the forms of the users in the team/project?
  • groups [user,team,project]: should we back up the vault related to the groups of the user/team/project?
  • drives [user,team,project]: should we back up the drives (marked as backup) of the user/team/project?
  • vault [user]: back up the user's vault
  • forms [user,project]: back up the user/project forms
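
The fields and names above could be modeled roughly as follows. This is a sketch, not the tiCrypt schema: the `invalidNames` and `backupDue` helpers are hypothetical, and the rule that a due checkpoint takes precedence over a due incremental backup is an assumption that the design does not state.

```typescript
type Scope = "user" | "team" | "project";
type StrategyMap = { [name: string]: boolean };

// Fields as described in the text above.
interface BackupStrategy {
  name: string;
  id: string;
  incrementalInterval: number; // seconds
  checkpointInterval: number; // seconds
  teamStrategy: StrategyMap;
  projectStrategy: StrategyMap;
  userStrategy: StrategyMap;
}

// Scopes in which each strategy name applies, copied from the list above.
const ALLOWED_SCOPES: { [name: string]: Scope[] } = {
  userVault: ["team", "project"],
  userForms: ["team", "project"],
  groups: ["user", "team", "project"],
  drives: ["user", "team", "project"],
  vault: ["user"],
  forms: ["user", "project"],
};

// Return the names in `map` that are unknown or do not apply to `scope`.
function invalidNames(map: StrategyMap, scope: Scope): string[] {
  return Object.keys(map).filter((key) => {
    const known = Object.prototype.hasOwnProperty.call(ALLOWED_SCOPES, key);
    return !known || ALLOWED_SCOPES[key].indexOf(scope) < 0;
  });
}

// Decide which kind of backup, if any, is due at time `now` (in seconds).
// Assumptions (not stated in the design): a due checkpoint takes precedence,
// and `lastBackup` is the time of the most recent backup of either kind.
function backupDue(
  strategy: BackupStrategy,
  lastCheckpoint: number,
  lastBackup: number,
  now: number
): "full" | "incremental" | null {
  if (now - lastCheckpoint >= strategy.checkpointInterval) return "full";
  if (now - lastBackup >= strategy.incrementalInterval) return "incremental";
  return null;
}
```

For example, `invalidNames({ vault: true, userVault: true }, "user")` flags `userVault`, since that name only applies to teams and projects.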

Backup domains

Backup domains are simply collections of users/projects/teams that need to be backed up together.

The backup domains can be set up at sub-admin+ level and do not require full admin access. Only resources that the sub-admin has access to can be included in a backup domain. Specifically:

  • if a sub-admin manages a team, the team can be added
  • if a sub-admin manages a project, the project can be added
  • if a user is in a team or a project managed by a sub-admin, the user can be added
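
The three rules above amount to a small access check. The sketch below assumes a hypothetical `ManagedResources` summary of what one sub-admin manages; how tiCrypt actually stores this information is not covered here.

```typescript
type ObjectType = "user" | "team" | "project";

// Hypothetical summary of what one sub-admin manages: each managed team or
// project ID maps to the IDs of the users that belong to it.
interface ManagedResources {
  teams: { [teamID: string]: string[] };
  projects: { [projectID: string]: string[] };
}

function hasKey(obj: { [key: string]: string[] }, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function isMemberOfAny(groups: { [id: string]: string[] }, userID: string): boolean {
  return Object.keys(groups).some((id) => groups[id].indexOf(userID) >= 0);
}

// Can this sub-admin add the given object to a backup domain?
function canAddToDomain(managed: ManagedResources, type: ObjectType, objID: string): boolean {
  if (type === "team") return hasKey(managed.teams, objID);
  if (type === "project") return hasKey(managed.projects, objID);
  // A user qualifies through membership in any managed team or project.
  return isMemberOfAny(managed.teams, objID) || isMemberOfAny(managed.projects, objID);
}
```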

At a high level, the backup domains specify:

  • name: displayable name of the domain
  • owner: Who owns the domain
  • managers: Other users that can manage this domain
  • sftp: SFTPSpec object
  • strategy: ID of the backup strategy to use

The SFTPSpec object has the following structure:

interface SFTPSpec {
  server: string, // the name or IP of the SFTP server
  port: number, // the port, default 22
  user: string, // SFTP user name
  directory: string, // the directory where backups are stored
}

The objects in the domain to back up are specified using a "membership" model. Specifically, an interface is needed that allows listing the objects in a domain, adding objects to it, and deleting objects from it. Listing and changing the membership can be done by specifying:

interface BackupDomainSpec {
  domainID: string, // the ID of the domain
  type: "user" | "team" | "project", // type of the object
  objID: string, // Object ID
}
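
As an illustration of the membership model, the following in-memory sketch supports listing, adding, and deleting entries. The `DomainMembership` class is hypothetical; a real service would persist the entries in its database and enforce access control.

```typescript
interface BackupDomainSpec {
  domainID: string; // the ID of the domain
  type: "user" | "team" | "project"; // type of the object
  objID: string; // object ID
}

// Hypothetical in-memory membership store for backup domains.
class DomainMembership {
  private entries: BackupDomainSpec[] = [];

  // Position of an identical entry, or -1 if there is none.
  private position(spec: BackupDomainSpec): number {
    for (let i = 0; i < this.entries.length; i++) {
      const e = this.entries[i];
      if (e.domainID === spec.domainID && e.type === spec.type && e.objID === spec.objID) {
        return i;
      }
    }
    return -1;
  }

  // List all objects that belong to a domain.
  list(domainID: string): BackupDomainSpec[] {
    return this.entries.filter((e) => e.domainID === domainID);
  }

  // Add an object to a domain; returns false if it is already a member.
  add(spec: BackupDomainSpec): boolean {
    if (this.position(spec) >= 0) return false;
    this.entries.push({ domainID: spec.domainID, type: spec.type, objID: spec.objID });
    return true;
  }

  // Remove an object from a domain; returns false if it was not a member.
  remove(spec: BackupDomainSpec): boolean {
    const i = this.position(spec);
    if (i < 0) return false;
    this.entries.splice(i, 1);
    return true;
  }
}
```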

Backups

Backups can be used for two different reasons:

  • disaster recovery: In situations when the system needs to be re-created from scratch (disastrous failure of storage, accidental system deletion, system cloning)
  • time machine: Recovering the state of a resource from the past, e.g. state of a drive as of 3 days ago, recover deleted file from 2 weeks ago, etc.

Some important principles are:

  • security: the backup is encrypted to the full extent possible. Neither the external backup server nor the backup solution used should be trusted.
  • minimize traffic: only the files strictly needed to perform recovery from backup should be required. This keeps the cost of recovery to a minimum.

Creating a backup

A backup consists of:

  1. Entries in the ticrypt-backup service database recording what is backed up and where
  2. Files placed on the tiCrypt backend server and later transferred to the remote SFTP backup server indicated by the backup domain metadata

Creating a backup will involve the following steps:

  1. A backup directory is created to host the metadata and files for the specific backup of the specific backup domain
  2. The server computes which resources (files, forms, drives) are backed up. For "full" backups, all such objects permitted by the backup strategies are listed; for "incremental" backups, only objects that changed in the backup period are listed
  3. Server prepares a "metadata-file" containing all the relevant backup information and the list of backed up files with the corresponding auxiliary files
  4. For each type of object, auxiliary files are copied into the "backup directory":
     • files: the file chunks
     • forms: the form entries, stored in an SQLite database
     • drives: the .qcow drive image (full backup), or the image snapshot (incremental backup)
  5. The metadata needed to recover information about the resources is added to the metadata file. Specifically:
     • files: file metadata and all file keys
     • forms: form metadata and all form keys
     • drives: drive metadata and all drive keys
  6. The information on this specific backup is stored in the ticrypt-backup service's database
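
Steps 2 to 4 above can be sketched as a small manifest builder. All names here (`Resource`, `ManifestEntry`, `buildManifest`) are hypothetical, and the sketch assumes that each resource carries a modification timestamp and that the input list has already been restricted by the backup strategy.

```typescript
type ResourceKind = "file" | "form" | "drive";
type BackupKind = "full" | "incremental";
type AuxiliaryKind = "file-chunks" | "sqlite-database" | "qcow-image" | "qcow-snapshot";

// Hypothetical resource record; `modifiedAt` is in seconds.
interface Resource {
  id: string;
  kind: ResourceKind;
  modifiedAt: number;
}

interface ManifestEntry {
  resourceID: string;
  kind: ResourceKind;
  auxiliary: AuxiliaryKind;
}

// Full backups list every permitted resource; incremental backups list only
// resources that changed since the start of the backup period.
function selectResources(resources: Resource[], backupKind: BackupKind, periodStart: number): Resource[] {
  if (backupKind === "full") return resources.slice();
  return resources.filter((r) => r.modifiedAt >= periodStart);
}

// Kind of auxiliary file copied into the backup directory for each resource.
function auxiliaryFor(kind: ResourceKind, backupKind: BackupKind): AuxiliaryKind {
  if (kind === "file") return "file-chunks";
  if (kind === "form") return "sqlite-database";
  return backupKind === "full" ? "qcow-image" : "qcow-snapshot";
}

function buildManifest(resources: Resource[], backupKind: BackupKind, periodStart: number): ManifestEntry[] {
  return selectResources(resources, backupKind, periodStart).map((r) => ({
    resourceID: r.id,
    kind: r.kind,
    auxiliary: auxiliaryFor(r.kind, backupKind),
  }));
}
```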

Recovering a resource from the backup

This is primarily performed to recover a previous version of a resource, i.e. time machine functionality.

To recover a resource from the backup, the following steps are taken:

  1. Recover the metadata file from the backend server or remote SFTP server
  2. Determine other backup metadata files (incremental backups need access to all the metadata files of all incremental backups they depend on)
  3. Compute all other auxiliary files needed
  4. Recover the content of the auxiliary files on the system backend (via SFTP if needed)
  5. Recover the state of the resource:
  • files: recover the content of missing chunks and the metadata
  • forms: recover the content of the SQLite database and metadata
  • drives: recover the state of the drive by "merging" snapshots on top of the most recent checkpoint.
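
Step 3 above, computing the auxiliary files needed, might look like the sketch below. It assumes a hypothetical `BackupMetadata` shape in which every metadata file lists the auxiliary files it holds per resource, and an illustrative `backupID/file` path layout; neither is specified by the design.

```typescript
// Hypothetical content of one backup's metadata file.
interface BackupMetadata {
  backupID: string;
  kind: "full" | "incremental";
  // auxiliary files recorded for each resource in this backup
  auxiliaryFiles: { [resourceID: string]: string[] };
}

// Given the metadata chain (checkpoint first, then incremental backups in
// chronological order), list the auxiliary files needed for one resource.
// For a drive, this is the image followed by the snapshots to merge, in order.
function auxiliaryFilesFor(chain: BackupMetadata[], resourceID: string): string[] {
  if (chain.length === 0 || chain[0].kind !== "full") {
    throw new Error("the chain must start with a checkpoint");
  }
  const needed: string[] = [];
  for (const meta of chain) {
    // Skip backups in which the resource did not change.
    if (!Object.prototype.hasOwnProperty.call(meta.auxiliaryFiles, resourceID)) continue;
    for (const file of meta.auxiliaryFiles[resourceID]) {
      needed.push(meta.backupID + "/" + file);
    }
  }
  return needed;
}
```

Only the files in the returned list have to be fetched, which is in line with the "minimize traffic" principle.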

Recovering the entire state from the backup

This is primarily performed as a form of disaster recovery, that is, to recover all resources covered by the backup.

The process is the same as above but it spans all resources covered by a backup.

Caching backup files for better performance

In order to speed up recovery, the backup files can be cached at multiple levels:

  • backend server: can store, without guarantee of availability, any of the backup files/directories.
  • external backup server: can likewise store, without guarantee of availability, any of the backup files/directories, thus removing the need to retrieve them from the actual backup solution.

It is more valuable to cache metadata rather than auxiliary files. The metadata is required even to figure out what can be recovered.
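
The caching levels suggest a simple lookup order: backend server first, then the external backup server, and only then the backup solution. The sketch below plans retrieval under that assumption; the `CacheState` shape and the function names are hypothetical.

```typescript
type Tier = "backend" | "external-server" | "backup-solution";

// Hypothetical snapshot of which backup files each cache currently holds.
interface CacheState {
  backend: string[];
  externalServer: string[];
}

// Find the cheapest place to get a file from. The backup solution is the
// fallback, since it is the only tier guaranteed to hold every file.
function locate(file: string, cache: CacheState): Tier {
  if (cache.backend.indexOf(file) >= 0) return "backend";
  if (cache.externalServer.indexOf(file) >= 0) return "external-server";
  return "backup-solution";
}

// Group the files needed for a recovery by the tier they will come from.
function planRetrieval(files: string[], cache: CacheState): Record<Tier, string[]> {
  const plan: Record<Tier, string[]> = {
    "backend": [],
    "external-server": [],
    "backup-solution": [],
  };
  for (const file of files) {
    plan[locate(file, cache)].push(file);
  }
  return plan;
}
```

A plan in which the `backup-solution` list is empty can be executed without touching tapes or cloud storage.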

User stories

  • As a sub-admin+, I want to list existing backup strategies
  • As an admin+, I want to be able to add/delete/modify backup strategies. This requires admin+ level since it is based on global system policy/requirements. It is also intended to limit the choices for backups to remove decision overload.
  • As a sub-admin+, I want to list backup domains. Depending on role and visibility, only some of the domains are visible here.
  • As a sub-admin+, I want to be able to add/delete/modify backup domains
  • As a sub-admin+, I want to add/delete other sub-admins to the list of managers for specific backup domains
  • As a sub-admin+, I want to list objects covered by backup domains. Depending on role and visibility, only some of the objects are visible here.
  • As a sub-admin+, I want to add/delete objects to a backup domain
  • As a sub-admin+, I want to list backups associated with a backup domain
  • As a sub-admin+, I want to check the status of an ongoing backup
  • As a sub-admin+, I want to initiate a backup (incremental or full) immediately on a specific domain. This might be needed to allow backups before dangerous operations. It probably requires special permission.
  • As a sub-admin+, I want to list resources that can be recovered by a specific backup
  • As a sub-admin+, I want to list backup files needed to recover a resource
  • As a sub-admin+, I want to list backup files needed to recover the full state of a backup domain
  • As a sub-admin+, I want to recover a specific resource, possibly in a "copy" of the original resource. This requires all files listed as backup files to be available on either the tiCrypt backend or the remote SFTP server.
  • As a sub-admin+, I want to recover a complete domain. This requires all files listed as backup files to be available on either the tiCrypt backend or the remote SFTP server.