Long-term project archival

Posted March 6, 2022 by Alin Dobra and Thomas Samant ‐ 5 min read

In this blog, we lay out a plan to support long-term project archival in tiCrypt.

Motivation

As hosted projects reach end of life in tiCrypt, clear plans need to be made to archive all project assets, such as files, encrypted drives, and membership information. Ideally, tiCrypt provides strong support for seamless archival of such projects, freeing resources while maintaining project security.

To meet the usability and security goals of tiCrypt, the archival process needs to satisfy the following properties:

  • All the project state needs to be archived. This includes the files, groups, encrypted drives, and membership information.
  • Direct support for archiving needs to be provided in both the tiCrypt frontend and backend.
  • Mechanisms to query and un-archive a project are needed.
  • A long-term (10+ years) solution for project asset decryption needs to be supported. Such a solution needs to consider that project members might no longer be associated with the organization when the project is un-archived. Several questions are essential:
    • Given that the original users might not be available to decrypt the project assets, how can the decryption take place?
    • How can the decryption mechanism be protected from attacks? Specifically, how can the strong protection tiCrypt guarantees for active projects be extended to archived projects?
    • Archived projects need to be stored on cheap, “cold” storage solutions such as magnetic tapes. How does the un-archiving process work operationally?

Overview of Feature

Two aspects of the technical solution need careful consideration:

  1. How to allow project asset decryption in the future without reliance on current users?

    • The key idea is to introduce a “project user” (at the project creation or just before archiving) that can decrypt all the project assets.
    • The private key of this project user should be carefully controlled so that decryption can take place only under well-defined circumstances.
    • The escrow mechanism is perfect for recovering the project user key since it requires a multi-party agreement. No one user will be able to decrypt project files unilaterally; an escrowed key recovery must take place.
  2. How to “package” project assets in a way that is reasonably efficient and easy to store externally?

    • The backup mechanism already contains most of the required features.
    • Performing a “full checkpoint” for the backup will ensure that the complete state of all assets is saved.
    • The backup mechanism is already capable of recovering assets.
    • Data security is already guaranteed for the backup mechanism.
    • The backup produces ordinary files that can be archived with any existing mechanism, such as SFTP transfers or file copying. This will ensure smooth integration with other archival technologies.

The Project User

Much like a regular user, the project user will have a private-public key pair generated upon project creation or later by one of the project administrators.

The public key of the project user, like all user public keys, will be stored in the database maintained by the tiCrypt backend and made available to all project members. When a project asset is created (or later shared with the project user), the asset key is shared with the project user through the same mechanism used to share assets with regular users: the asset key is encrypted with the project user’s public key and stored in the database. Only knowledge of the public key is needed for such a cryptographic key exchange.

The private key of the project user is handled differently from the private keys of regular users. Instead of being handed over to some user and protected by a password, it will be placed in escrow to ensure that it can be recovered when the project is un-archived. Specifically, once the project user has been created, its private key cannot be recovered without the multi-party key recovery mechanism implemented by the de-escrow procedure. Nobody, including project managers, will be able to recover the project user’s private key outside the de-escrow mechanism. Should a user with access to all project assets participate in the project un-archiving, there will be no need to recover the project user’s private key (the files can be decrypted by that user with their own private key).
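To illustrate why a multi-party split rules out unilateral decryption, here is a minimal Python sketch. It is not a description of tiCrypt's escrow implementation, which this post does not detail. It assumes a simple scheme in which every party must contribute its share (an XOR split), and the names `split_key` and `recover_key` are hypothetical.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(private_key: bytes, parties: int) -> list[bytes]:
    """Split a key into `parties` shares (assumed scheme: all shares required)."""
    if parties < 2:
        raise ValueError("need at least two parties")
    # All shares but the last are pure randomness.
    random_shares = [secrets.token_bytes(len(private_key)) for _ in range(parties - 1)]
    # The last share is the key XOR-ed with every random share.
    last_share = reduce(xor_bytes, random_shares, private_key)
    return random_shares + [last_share]


def recover_key(shares: list[bytes]) -> bytes:
    """Rebuild the key by XOR-ing every share together."""
    return reduce(xor_bytes, shares)


# Hypothetical usage: three escrow parties each hold one share.
project_private_key = secrets.token_bytes(32)
shares = split_key(project_private_key, parties=3)
recovered = recover_key(shares)
```

Any subset of the shares that leaves one out is indistinguishable from random bytes, which is the property the escrow requirement relies on. A production scheme would more likely use a threshold design so that recovery survives the loss of a party, but that choice is outside what this post specifies.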

To provide future decryption capabilities, the project archival mechanism will ensure that all assets have keys for the project user before the archival completes. This way, recovering the project user’s key will suffice in the future un-archiving process. To ensure the future decryption of project assets, the project users of all archived projects need to be permanently kept in good standing with the escrow mechanism. Specifically, as escrow users get added to and removed from escrow groups, the project user’s private key parts for all archived projects need to be “handed over” to new escrow users using the same mechanism employed for regular user keys.
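The pre-archival check described above can be sketched as follows. This is only an illustration under assumed data shapes: the mapping from asset to the set of users holding an encrypted copy of its key, the `PROJECT_USER` identifier, and both function names are hypothetical and not part of tiCrypt.

```python
PROJECT_USER = "project-user"  # hypothetical identifier for the project user


def assets_missing_project_key(
    key_holders: dict[str, set[str]], project_user: str = PROJECT_USER
) -> list[str]:
    """Return the assets whose key has not been shared with the project user."""
    return sorted(
        asset for asset, holders in key_holders.items() if project_user not in holders
    )


def can_complete_archival(
    key_holders: dict[str, set[str]], project_user: str = PROJECT_USER
) -> bool:
    """Archival may complete only when every asset has a key for the project user."""
    return not assets_missing_project_key(key_holders, project_user)


# Hypothetical project state: asset name -> users holding its encrypted key.
key_holders = {
    "drive-01": {"alice", "project-user"},
    "results.csv": {"alice", "bob"},
    "notes.txt": {"bob", "project-user"},
}
missing = assets_missing_project_key(key_holders)  # ["results.csv"]
```

In this sketch, archival would be blocked until `results.csv` is shared with the project user, after which `can_complete_archival` returns `True`.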

Archival Process Mechanics

Project archival will be distinct from the tiCrypt backup mechanism but will share many of its mechanisms. Specifically, the archival process will:

  1. Create a directory structure similar to the one used by the backup mechanism.
  2. Fully archive the encrypted drives (the equivalent of the backup checkpoint).
  3. Archive the files belonging to the project.
  4. Archive all project metadata in an SQLite database similar to the one used by the backup mechanism. This includes:
    • file metadata and directory structure
    • group metadata
    • member user metadata
    • drive metadata
  5. Allow the archive content to be “queried” through this database without the entire archive data being present; this is a similar mechanism to the existing backup.
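As a rough illustration of querying archive metadata without the archived data itself, the sketch below uses Python's built-in `sqlite3` module. The schema (a `files` table with `path` and `size_bytes` columns) and the function names are assumptions made for this example; the actual archive schema is not specified in this post.

```python
import sqlite3


def build_example_archive_db() -> sqlite3.Connection:
    """Create an in-memory metadata database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files ("
        "id INTEGER PRIMARY KEY, path TEXT NOT NULL, size_bytes INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO files (path, size_bytes) VALUES (?, ?)",
        [("/data/a.csv", 1024), ("/data/b.csv", 2048), ("/docs/readme.txt", 300)],
    )
    conn.commit()
    return conn


def list_files(conn: sqlite3.Connection, prefix: str) -> list[tuple[str, int]]:
    """List archived files under a directory prefix using only the metadata."""
    cursor = conn.execute(
        "SELECT path, size_bytes FROM files "
        "WHERE substr(path, 1, ?) = ? ORDER BY path",
        (len(prefix), prefix),
    )
    return cursor.fetchall()


conn = build_example_archive_db()
data_files = list_files(conn, "/data/")
total_bytes = sum(size for _, size in data_files)
```

Because only the small metadata database is consulted, a query like this could run while the bulk of the archive remains on cold storage.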

Timeline

We expect to have the project archival feature released into production by December 2022.