This document describes the security architecture, cryptographic design, and limitations of git-cas's content-addressable storage system with optional encryption.
- Threat Model
- Cryptographic Design
- Key Handling
- Encryption Flow
- Decryption Flow
- Chunk Digest Verification
- Limitations
- Git Object Immutability
- Error Codes for Security Operations
git-cas provides defense against the following threat scenarios:
- At-rest confidentiality: When encryption is enabled, stored content is protected from unauthorized reading by anyone who gains access to the Git object database without the encryption key.
- Data integrity: All stored content (encrypted or not) is protected by SHA-256 digests per chunk. Any corruption, tampering, or bit-rot is detected during restore or integrity verification.
- Authentication of ciphertext: AES-256-GCM's built-in authentication tag ensures that encrypted data has not been modified or tampered with. Any modification to ciphertext will cause decryption to fail.
git-cas does NOT provide protection in the following scenarios:
- Key management: git-cas does not store, manage, or rotate encryption keys. Key storage and lifecycle management are entirely the caller's responsibility.
- Access control: git-cas does not implement access control lists or authorization policies. If an attacker can access the Git repository and has the encryption key, they can read all content.
- Side-channel attacks: No protection against timing attacks, power analysis, or other side-channel attacks on the cryptographic operations.
- Memory safety: Decryption of encrypted content loads the entire ciphertext into memory. No protection against memory dumps or swap file exposure.
- Key recovery: If an encryption key is lost, there is no key recovery mechanism. Encrypted data becomes permanently inaccessible.
- Metadata privacy: The following metadata is NOT encrypted:
  - Manifest structure (slug, filename, chunk count)
  - Chunk sizes and indices
  - SHA-256 digests of encrypted chunks
  - Git tree and blob object IDs
- Deletion guarantees: Logical deletion from the manifest does not physically remove data from Git's object database. See Git Object Immutability.
- Concurrent key rotation: There is no support for re-encrypting content with a different key while maintaining availability.
git-cas uses AES-256-GCM (Galois/Counter Mode) for authenticated encryption:
- Algorithm: aes-256-gcm via runtime-specific adapters (Node.js node:crypto, Bun CryptoHasher + node:crypto, Deno/Web crypto.subtle)
- Key size: 256 bits (32 bytes)
- Nonce size: 96 bits (12 bytes), cryptographically random
- Authentication tag: 128 bits (16 bytes)
AES-256-GCM was chosen because:
- Authenticated Encryption with Associated Data (AEAD): Provides both confidentiality and integrity/authenticity in a single operation.
- Nonce-based: Does not require unique per-message keys, only unique nonces.
- Industry standard: Widely deployed, well-studied, and supported by hardware acceleration on modern CPUs.
- Streaming-friendly: GCM mode allows incremental encryption without padding requirements.
Each encryption operation generates a fresh 96-bit (12-byte) nonce using crypto.randomBytes(12):
- Uniqueness requirement: The same key must NEVER be used with the same nonce twice.
- Random generation: git-cas uses cryptographically secure random number generation from Node.js's crypto.randomBytes(), which sources from the OS entropy pool.
- Collision probability: With 96-bit random nonces, the birthday bound puts the collision probability at roughly n^2 / 2^97 for n encryptions under the same key. That is about 2^-33 at 2^32 encryptions, the limit NIST SP 800-38D sets for randomly generated nonces. A collision becomes likely as n approaches 2^48.
CRITICAL: Callers must NOT reuse encryption keys across a large number of operations (approaching 2^32 encryptions with a single key). While collision is unlikely, best practice is to rotate keys periodically.
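The collision risk for random 96-bit nonces can be estimated with the birthday approximation. The following is a minimal sketch. The function name `nonceCollisionProbability` is illustrative and not part of git-cas, and the approximation only holds while the result is small.

```javascript
// Birthday approximation for random nonces:
// p ≈ n(n - 1) / 2^(bits + 1) for n encryptions under one key.
// Only meaningful while p is much smaller than 1.
function nonceCollisionProbability(encryptions, nonceBits = 96) {
  const n = encryptions;
  return (n * (n - 1)) / 2 ** (nonceBits + 1);
}

// At 2^32 encryptions the estimate is close to 2^-33.
console.log(Math.log2(nonceCollisionProbability(2 ** 32)));

// At one million encryptions the estimate is many orders of magnitude smaller.
console.log(nonceCollisionProbability(1e6));
```

Rotating keys well before 2^32 encryptions keeps the estimate comfortably below the commonly cited 2^-32 threshold.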
After encryption completes, AES-256-GCM produces a 128-bit authentication tag:
- The tag is stored in the manifest's encryption.tag field (base64-encoded).
- During decryption, the tag is supplied to the decipher created by createDecipheriv() via setAuthTag() and is checked when decipher.final() is called.
- If the ciphertext or tag has been modified, decipher.final() will throw an error.
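The tag behaviour described above can be observed directly with node:crypto. This is a minimal sketch rather than git-cas's own code: the helper names `encryptGcm` and `decryptGcm` are assumptions made for illustration.

```javascript
const { randomBytes, createCipheriv, createDecipheriv } = require('node:crypto');

// Encrypts a Buffer and returns the ciphertext together with nonce and tag.
function encryptGcm(key, plaintext) {
  const nonce = randomBytes(12); // fresh 96-bit nonce per encryption
  const cipher = createCipheriv('aes-256-gcm', key, nonce);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { ciphertext, nonce, tag: cipher.getAuthTag() }; // 16-byte tag
}

// Decrypts and authenticates; decipher.final() throws if the tag does not match.
function decryptGcm(key, { ciphertext, nonce, tag }) {
  const decipher = createDecipheriv('aes-256-gcm', key, nonce);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}

const key = randomBytes(32);
const sealed = encryptGcm(key, Buffer.from('hello git-cas'));
console.log(decryptGcm(key, sealed).toString());

sealed.ciphertext[0] ^= 0x01; // flip a single ciphertext bit
try {
  decryptGcm(key, sealed);
} catch (err) {
  console.log('tampering detected:', err.message);
}
```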
The encryption layer wraps the chunking layer:
[Plain source stream] → [Encrypt stream] → [Chunk into 256KB blocks] → [Store as Git blobs]
This means:
- Encrypted chunks are not individually authenticated: The entire ciphertext is authenticated as a single unit by the GCM tag.
- Chunk digests are computed on ciphertext: The SHA-256 digest stored in each chunk entry is the hash of the encrypted data, not the plaintext.
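- Chunking is deterministic: Given the same plaintext, key, and nonce, the encrypted chunks are identical. Because each store operation draws a fresh random nonce, storing the same plaintext twice produces different ciphertext, chunk digests, and blob OIDs.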
git-cas does not store encryption keys. All key management responsibilities fall on the caller:
- Key generation: The caller must generate cryptographically secure 256-bit (32-byte) keys.
- Key storage: The caller must securely store keys (e.g., in environment variables, key management systems, hardware security modules).
- Key distribution: If keys need to be shared across systems, the caller must implement secure key distribution.
- Key rotation: The caller must implement key rotation policies. git-cas does not support re-encrypting content with a new key.
git-cas validates keys before use:
_validateKey(key) {
  if (!Buffer.isBuffer(key) && !(key instanceof Uint8Array)) {
    throw new CasError(
      'Encryption key must be a Buffer or Uint8Array',
      'INVALID_KEY_TYPE',
    );
  }
  if (key.length !== 32) {
    throw new CasError(
      `Encryption key must be 32 bytes, got ${key.length}`,
      'INVALID_KEY_LENGTH',
      { expected: 32, actual: key.length },
    );
  }
}
Accepted types: Buffer or Uint8Array
Required length: Exactly 32 bytes (256 bits)
If validation fails:
- INVALID_KEY_TYPE: Key is not a Buffer or Uint8Array
- INVALID_KEY_LENGTH: Key is not 32 bytes
Recommended key handling practices:
- Generate keys using a CSPRNG: Use crypto.randomBytes(32) or equivalent.
- Never hardcode keys: Store keys in secure configuration, not in source code.
- Use unique keys per project/environment: Do not reuse the same key across different repositories or environments.
- Rotate keys periodically: Establish a key rotation policy (e.g., every 90 days).
- Secure key backups: If keys are backed up, encrypt the backup with a separate master key.
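A minimal sketch of the first two practices, generating a key from a CSPRNG and loading it from configuration instead of source code. The helper names and the `GIT_CAS_KEY` environment variable are hypothetical and are not defined by git-cas.

```javascript
const { randomBytes } = require('node:crypto');

// Generates a fresh 256-bit key from the OS CSPRNG.
function generateKey() {
  return randomBytes(32);
}

// Decodes a base64-encoded key (for example from an environment variable)
// and rejects anything that is not exactly 32 bytes after decoding.
function loadKeyFromBase64(encoded) {
  if (typeof encoded !== 'string' || encoded.length === 0) {
    throw new Error('encryption key is not configured');
  }
  const key = Buffer.from(encoded, 'base64');
  if (key.length !== 32) {
    throw new Error(`expected a 32-byte key, got ${key.length} bytes`);
  }
  return key;
}

// Usage (GIT_CAS_KEY is a hypothetical variable name):
// const key = loadKeyFromBase64(process.env.GIT_CAS_KEY);
console.log(generateKey().toString('base64').length); // base64 length of a 32-byte key
```

Checking the decoded length early surfaces truncation or encoding mistakes before the key reaches git-cas's own validation.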
When storing content with encryption enabled:
- Caller provides source (async iterable of Buffers), slug, filename, and encryptionKey.
- git-cas validates the key.
- git-cas creates a streaming encryption context with a random nonce.
- The source stream is encrypted incrementally.
- Encrypted chunks are buffered to 256KB boundaries.
- Each 256KB encrypted chunk is hashed (SHA-256) and written as a Git blob.
- After encryption completes, the GCM authentication tag is retrieved.
- Encryption metadata (algorithm, nonce, tag) is stored in the manifest.
Step 1: Key Validation
if (encryptionKey) {
  this._validateKey(encryptionKey);
}
- If encryptionKey is provided, validate it is a 32-byte Buffer/Uint8Array.
- If validation fails, throw CasError with code INVALID_KEY_TYPE or INVALID_KEY_LENGTH.
Step 2: Initialize Manifest Data
const manifestData = {
  slug,
  filename,
  size: 0,
  chunks: [],
};
Step 3: Create Encryption Stream
const { encrypt, finalize } = this.crypto.createEncryptionStream(encryptionKey);
- createEncryptionStream() generates a 12-byte random nonce.
- Creates an aes-256-gcm cipher with the key and nonce.
- Returns:
  - encrypt: an async generator function that yields encrypted chunks
  - finalize: a function that returns encryption metadata after encryption completes
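One way the { encrypt, finalize } pair could be built on node:crypto is sketched below. This is an assumption about the shape only. The actual adapter in git-cas may differ, and the guard in `finalize()` is an illustrative addition.

```javascript
const { randomBytes, createCipheriv } = require('node:crypto');

// Hypothetical sketch of an encryption stream factory for AES-256-GCM.
function createEncryptionStream(key) {
  const nonce = randomBytes(12);
  const cipher = createCipheriv('aes-256-gcm', key, nonce);
  let finished = false;

  // Encrypts an async iterable of Buffers incrementally.
  async function* encrypt(source) {
    for await (const piece of source) {
      const out = cipher.update(piece);
      if (out.length > 0) yield out;
    }
    const last = cipher.final(); // the tag becomes available after final()
    if (last.length > 0) yield last;
    finished = true;
  }

  // Returns the metadata that would be stored in the manifest.
  function finalize() {
    if (!finished) {
      throw new Error('finalize() called before encryption completed');
    }
    return {
      algorithm: 'aes-256-gcm',
      nonce: nonce.toString('base64'),
      tag: cipher.getAuthTag().toString('base64'),
      encrypted: true,
    };
  }

  return { encrypt, finalize };
}
```

The tag can only be read after the generator has been fully consumed, which is why the metadata is produced by a separate `finalize()` call.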
Step 4: Chunk and Store Encrypted Stream
await this._chunkAndStore(encrypt(source), manifestData);
- The encrypt(source) async generator reads from the source, encrypts data incrementally, and yields encrypted buffers.
- _chunkAndStore() buffers encrypted data to 256KB boundaries.
- Each 256KB chunk is SHA-256 hashed and written as a Git blob.
- Chunk metadata (index, size, digest, blob OID) is appended to manifestData.chunks.
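The buffering and hashing in this step can be sketched as an async generator. This is illustrative only: `chunkWithDigests` is a hypothetical name, the 256KB size is taken to mean 256 * 1024 bytes, and the Git blob write is left out.

```javascript
const { createHash } = require('node:crypto');

const CHUNK_SIZE = 256 * 1024; // assumed meaning of "256KB"

// Re-chunks an async iterable of Buffers into fixed-size blocks and yields
// { index, size, digest, data } for each block. The final block may be shorter.
async function* chunkWithDigests(source, chunkSize = CHUNK_SIZE) {
  let pending = Buffer.alloc(0);
  let index = 0;
  const describe = (data) => ({
    index: index++,
    size: data.length,
    digest: createHash('sha256').update(data).digest('hex'),
    data,
  });

  for await (const piece of source) {
    pending = Buffer.concat([pending, piece]);
    while (pending.length >= chunkSize) {
      yield describe(pending.subarray(0, chunkSize));
      pending = pending.subarray(chunkSize);
    }
  }
  if (pending.length > 0) yield describe(pending);
}
```

Because the input here is already ciphertext, each digest covers encrypted bytes, matching the chunk digest semantics described earlier.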
Step 5: Finalize Encryption Metadata
manifestData.encryption = finalize();
- finalize() retrieves the GCM authentication tag.
- Returns an object:
  {
    algorithm: 'aes-256-gcm',
    nonce: '<base64-encoded-nonce>',
    tag: '<base64-encoded-tag>',
    encrypted: true,
  }
- This metadata is stored in the manifest's encryption field.
Step 6: Create Manifest
const manifest = new Manifest(manifestData);
- Streaming encryption: Data is encrypted incrementally. The entire plaintext does NOT need to fit in memory during encryption.
- Deterministic chunking: For the same plaintext and key/nonce, the chunk boundaries and digests are deterministic.
- No plaintext leakage: The plaintext source is never written to disk. Only encrypted chunks are persisted.
When restoring content with encryption:
- Caller provides manifest and encryptionKey.
- git-cas validates the key.
- git-cas reads all chunk blobs from Git.
- Each chunk's SHA-256 digest is verified against the stored digest in the manifest.
- All encrypted chunks are concatenated into a single ciphertext buffer.
- The ciphertext is decrypted using AES-256-GCM with the stored nonce and tag.
- If the tag verification fails, decryption throws an integrity error.
- The plaintext buffer is returned to the caller.
Step 1: Key Validation
if (encryptionKey) {
  this._validateKey(encryptionKey);
}
Step 2: Check if Key is Required
if (manifest.encryption?.encrypted && !encryptionKey) {
  throw new CasError(
    'Encryption key required to restore encrypted content',
    'MISSING_KEY',
  );
}
- If the manifest indicates content is encrypted but no key is provided, throw MISSING_KEY.
Step 3: Read and Verify Chunks
const chunks = await this._readAndVerifyChunks(manifest.chunks);
- For each chunk in the manifest:
  - Read the Git blob by OID.
  - Compute SHA-256 digest of the blob.
  - Compare computed digest with stored digest in manifest.
  - If mismatch, throw CasError with code INTEGRITY_ERROR.
  - If match, append blob to buffers array.
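The per-chunk loop above could look roughly like the following. It is a sketch under assumptions: `readBlob` is a hypothetical async function standing in for the Git read, the chunk field holding the blob OID is assumed to be named `blob`, and a plain Error with a `code` property stands in for CasError.

```javascript
const { createHash } = require('node:crypto');

// Reads every chunk, verifies its SHA-256 digest, and returns the blobs in order.
async function readAndVerifyChunks(chunks, readBlob) {
  const buffers = [];
  for (const chunk of chunks) {
    const blob = await readBlob(chunk.blob); // hypothetical Git read by OID
    const actual = createHash('sha256').update(blob).digest('hex');
    if (actual !== chunk.digest) {
      const err = new Error(`Chunk ${chunk.index} integrity check failed`);
      err.code = 'INTEGRITY_ERROR';
      err.meta = { chunkIndex: chunk.index, expected: chunk.digest, actual };
      throw err;
    }
    buffers.push(blob);
  }
  return buffers;
}
```

Failing on the first mismatch means no ciphertext is handed to the decryption step unless every chunk has verified.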
Step 4: Concatenate Encrypted Chunks
let buffer = Buffer.concat(chunks);
- All encrypted chunk buffers are concatenated into a single ciphertext buffer.
CRITICAL: This operation loads the entire ciphertext into memory. For large files, this may cause memory exhaustion. See Limitations.
Step 5: Decrypt Buffer
if (manifest.encryption?.encrypted) {
  buffer = await this.decrypt({
    buffer,
    key: encryptionKey,
    meta: manifest.encryption,
  });
}
- Extract nonce and tag from manifest.encryption.
- Create aes-256-gcm decipher with key and nonce.
- Set authentication tag via setAuthTag().
- Decrypt the ciphertext:
  const nonce = Buffer.from(meta.nonce, 'base64');
  const tag = Buffer.from(meta.tag, 'base64');
  const decipher = createDecipheriv('aes-256-gcm', key, nonce);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(buffer), decipher.final()]);
- If decipher.final() throws (due to tag mismatch or corrupted ciphertext), catch and re-throw as CasError with code INTEGRITY_ERROR.
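The catch-and-rethrow behaviour can be sketched as follows. `decryptBuffer` is a hypothetical stand-in for the decrypt method, and a plain Error with a `code` property is used in place of CasError.

```javascript
const { createDecipheriv } = require('node:crypto');

// Decrypts a full ciphertext buffer and maps any GCM failure to INTEGRITY_ERROR.
function decryptBuffer({ buffer, key, meta }) {
  const nonce = Buffer.from(meta.nonce, 'base64');
  const tag = Buffer.from(meta.tag, 'base64');
  const decipher = createDecipheriv('aes-256-gcm', key, nonce);
  decipher.setAuthTag(tag);
  try {
    // final() throws when the tag does not match the key, nonce, and ciphertext.
    return Buffer.concat([decipher.update(buffer), decipher.final()]);
  } catch (cause) {
    const err = new Error('Decryption failed: authentication tag mismatch');
    err.code = 'INTEGRITY_ERROR';
    err.cause = cause;
    throw err;
  }
}
```

A wrong key and a modified ciphertext are indistinguishable at this point, since both surface as the same tag failure.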
Step 6: Return Plaintext
return { buffer, bytesWritten: buffer.length };
- No streaming decryption: The entire ciphertext must be loaded into memory before decryption. This is a limitation of the current implementation.
- Authentication before release: decipher.final() verifies the GCM tag before restore() returns. If the tag check fails, an error is thrown and no plaintext reaches the caller.
- Chunk integrity before decryption: SHA-256 verification of encrypted chunks occurs before decryption. This detects corruption at the chunk level.
Every chunk (encrypted or unencrypted) is protected by a SHA-256 digest:
- Digest computation: When a chunk is stored, crypto.createHash('sha256').update(buf).digest('hex') is computed and stored in the manifest.
- Digest verification: When a chunk is read during restore() or verifyIntegrity(), the digest is recomputed and compared.
- During restore (restore() method):
  - Every chunk is read from Git and its SHA-256 digest is verified.
  - If any digest mismatch is detected, restore() throws CasError with code INTEGRITY_ERROR.
- During integrity verification (verifyIntegrity() method):
  - All chunks are read and their SHA-256 digests are verified.
  - If any digest mismatch is detected, verifyIntegrity() returns false and emits an integrity:fail event.
- Bit-rot: Silent corruption of Git objects on disk.
- Storage errors: Corruption during disk writes or reads.
- Tampering: Intentional modification of chunk blobs.
- Incomplete writes: Partial writes during storage failures.
- Manifest tampering: If an attacker modifies the manifest to point to different blobs with matching digests, the chunk verification will pass. However:
  - For unencrypted content, this results in incorrect data being restored.
  - For encrypted content, GCM tag verification will fail unless the attacker also forges the authentication tag (which is computationally infeasible).
- Rollback attacks: If an attacker replaces a newer manifest with an older one, chunk digests will still verify. Application-level versioning or commit signing is required to prevent rollback.
Issue: The restore() method concatenates all encrypted chunks into a single buffer before decryption:
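let buffer = Buffer.concat(chunks);
Impact:
- For large encrypted files (e.g., 1GB+), this can cause memory exhaustion.
- Node.js caps the size of a single Buffer at buffer.constants.MAX_LENGTH, which varies by Node.js version and architecture (about 2GB on some older 64-bit releases, larger on newer ones).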
Workaround:
- Avoid encrypting extremely large files with git-cas.
- If large encrypted files are required, implement application-level chunking (e.g., split a 10GB file into 10 separate 1GB files before storing).
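Application-level splitting could be sketched as below. `splitIntoParts` is a hypothetical helper. How each part is then stored (for example under a per-part slug) is left to the application and is not shown.

```javascript
// Splits an async iterable of Buffers into consecutive parts of at most
// maxPartBytes each. Every yielded part is a single Buffer.
async function* splitIntoParts(source, maxPartBytes) {
  if (!Number.isInteger(maxPartBytes) || maxPartBytes <= 0) {
    throw new RangeError('maxPartBytes must be a positive integer');
  }
  let part = [];
  let partBytes = 0;
  for await (let piece of source) {
    while (piece.length > 0) {
      const take = Math.min(piece.length, maxPartBytes - partBytes);
      part.push(piece.subarray(0, take));
      partBytes += take;
      piece = piece.subarray(take);
      if (partBytes === maxPartBytes) {
        yield Buffer.concat(part);
        part = [];
        partBytes = 0;
      }
    }
  }
  if (partBytes > 0) yield Buffer.concat(part);
}
```

Each part is held in memory as one Buffer, so the part size should be chosen to fit comfortably within the memory available for both storing and restoring.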
Future improvement: Implement streaming decryption to process ciphertext in chunks without full concatenation.
Issue: AES-256-GCM decryption is currently performed on the entire ciphertext as a single operation. The authentication tag is verified only at the end of decryption.
Impact:
- Cannot stream decrypted plaintext to the caller incrementally.
- Cannot detect tampering until the entire ciphertext is processed.
Future improvement: Investigate chunked AEAD modes or encrypt-then-MAC schemes that allow incremental authentication.
Issue: git-cas does not support re-encrypting content with a new key while maintaining the same manifest structure.
Impact:
- If a key is compromised, all content encrypted with that key must be manually re-encrypted by:
- Restoring content with the old key.
- Storing content again with the new key.
- Updating all references to the old manifest tree to the new manifest tree.
Workaround:
- Implement application-level key rotation by maintaining a key version identifier alongside each manifest.
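A key version identifier can be tracked with a small application-side registry. The `KeyRing` class below is a hypothetical sketch. git-cas has no such concept, and the version id would need to be stored by the application next to each manifest reference.

```javascript
// Hypothetical registry mapping key version ids to 32-byte keys.
class KeyRing {
  constructor() {
    this.keys = new Map();
    this.currentVersion = null;
  }

  // Registers a key and makes it the one used for new encryptions.
  add(version, key) {
    if (!(key instanceof Uint8Array) || key.length !== 32) {
      throw new Error('key must be a 32-byte Buffer or Uint8Array');
    }
    this.keys.set(version, key);
    this.currentVersion = version;
  }

  // Key to use when storing new content.
  current() {
    if (this.currentVersion === null) throw new Error('no keys registered');
    return { version: this.currentVersion, key: this.keys.get(this.currentVersion) };
  }

  // Key to use when restoring content recorded with a given version id.
  keyFor(version) {
    const key = this.keys.get(version);
    if (!key) throw new Error(`unknown key version: ${version}`);
    return key;
  }
}
```

Old keys stay available for restores until all content recorded under them has been re-stored with the current key.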
Future improvement: Add a reencrypt() method that re-encrypts content with a new key without requiring full restore.
Issue: While 96-bit random nonces have negligible collision probability for practical use cases, NIST SP 800-38D limits a single key to 2^32 encryptions with random nonces so that the collision probability stays below about 2^-32.
Impact:
- If the same key is used to encrypt more than 2^32 files, nonce reuse becomes more likely.
- Nonce reuse with AES-GCM is catastrophic: it reveals the XOR of the affected plaintexts and lets an attacker recover the GHASH authentication subkey, which enables forgeries.
Mitigation:
- Rotate encryption keys after a reasonable number of operations (e.g., every 1 million encryptions, or every 90 days, whichever comes first).
Issue: The following metadata is stored in plaintext in the manifest:
- slug (file identifier)
- filename
- size (total size of encrypted content)
- chunks array (chunk indices, sizes, digests, blob OIDs)
Impact:
- An attacker with access to the repository can infer file structure, sizes, and access patterns.
- Chunk digests may leak information about plaintext content if chunks are small or predictable.
Mitigation:
- If metadata privacy is required, implement application-level encryption of the entire manifest before storing it as a Git blob.
Issue: git-cas does not include versioning or timestamps in the encryption metadata.
Impact:
- An attacker can replace a newer manifest tree with an older one (rollback attack).
- An attacker can duplicate encrypted content across different slugs (replay attack).
Mitigation:
- Use Git commit signing to authenticate manifest trees.
- Implement application-level versioning or monotonic counters.
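A monotonic counter check might look like the sketch below. Everything here is hypothetical: the `version` counter, the `lastSeen` map, and the ROLLBACK_DETECTED code are application-level inventions and are not part of git-cas.

```javascript
// Rejects a manifest whose application-level version is lower than the
// highest version previously accepted for the same slug.
function acceptManifestVersion(lastSeen, slug, version) {
  if (!Number.isInteger(version) || version < 0) {
    throw new TypeError('version must be a non-negative integer');
  }
  const previous = lastSeen.has(slug) ? lastSeen.get(slug) : -1;
  if (version < previous) {
    const err = new Error(`Rollback detected for ${slug}: ${version} < ${previous}`);
    err.code = 'ROLLBACK_DETECTED'; // hypothetical code, not defined by git-cas
    throw err;
  }
  lastSeen.set(slug, version);
  return version;
}
```

The check is only as trustworthy as the place where `lastSeen` is kept, so that state should live somewhere an attacker with repository access cannot rewrite.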
Git's object database (ODB) is append-only and content-addressed:
- Once a blob, tree, or commit is written, its content is immutable.
- Objects are stored in .git/objects/ and referenced by their SHA-1 (or SHA-256) hash.
git-cas does NOT provide a delete() method because:
- Logical deletion is trivial: Remove the reference to a manifest tree from your application's index.
- Physical deletion is a Git-level operation: Unreferenced objects remain in .git/objects/ until garbage collection.
To physically remove unreferenced objects:
git gc --aggressive --prune=now
Important:
- git gc only removes objects that are not reachable from any ref (branch, tag, commit).
- If a manifest tree is still referenced (e.g., in a commit or reflog), its chunks will NOT be pruned.
- Deleted content may persist: If you "delete" a file by removing its manifest reference, the encrypted chunks remain in .git/objects/ until git gc prunes them.
- Reflog prevents immediate pruning: Git's reflog keeps entries for old commits for 90 days by default (30 days for entries that are no longer reachable from the current branch tip). To prune immediately:
  git reflog expire --expire=now --all
  git gc --prune=now
- Shallow clones do not remove history: Even if you force-push to remove a commit, the objects remain in the local repository until pruned.
- Do not rely on logical deletion for security: If sensitive content was encrypted and stored, assume the ciphertext remains in the repository until git gc prunes it.
- Prune after sensitive operations: After removing sensitive content, run:
  git reflog expire --expire=now --all
  git gc --aggressive --prune=now
- Consider repository rotation: For highly sensitive data, periodically create a new repository and migrate only non-sensitive content.
git-cas defines the following error codes for security-related operations:
INTEGRITY_ERROR
Thrown when:
- A chunk's SHA-256 digest does not match the stored digest in the manifest.
- AES-256-GCM authentication tag verification fails during decryption.
Example:
throw new CasError(
  'Chunk 2 integrity check failed',
  'INTEGRITY_ERROR',
  { chunkIndex: 2, expected: 'abc123...', actual: 'def456...' },
);
Possible causes:
- Corruption of Git objects on disk.
- Tampering with chunk blobs.
- Wrong encryption key used for decryption (GCM tag mismatch).
- Incomplete or interrupted writes.
Recommended action:
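- If this occurs during restore() of encrypted content, first confirm that the correct encryption key was supplied, since a wrong key also fails the GCM tag check. If the key is correct, the stored data is corrupted and cannot be recovered without a backup.
- If this occurs during verifyIntegrity(), investigate storage hardware or Git repository health.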
INVALID_KEY_LENGTH
Thrown when:
- An encryption key is provided but is not exactly 32 bytes (256 bits).
Example:
throw new CasError(
  'Encryption key must be 32 bytes, got 16',
  'INVALID_KEY_LENGTH',
  { expected: 32, actual: 16 },
);
Possible causes:
- Incorrect key generation (e.g., using 128-bit AES key instead of 256-bit).
- Key truncation during storage or transmission.
- Encoding issues (e.g., base64 decoding resulting in wrong length).
Recommended action:
- Verify key generation logic uses crypto.randomBytes(32) or equivalent.
- Check key storage/retrieval does not corrupt or truncate the key.
INVALID_KEY_TYPE
Thrown when:
- An encryption key is provided but is not a Buffer or Uint8Array.
Example:
throw new CasError(
  'Encryption key must be a Buffer or Uint8Array',
  'INVALID_KEY_TYPE',
);
Possible causes:
- Passing a string instead of a Buffer (e.g., "my-secret-key" instead of Buffer.from("my-secret-key")).
- Passing a base64-encoded string without decoding it first.
Recommended action:
- Ensure keys are stored as Buffer or Uint8Array.
- If keys are stored as hex/base64 strings, decode them before passing to git-cas:
  const key = Buffer.from(keyBase64, 'base64');
MISSING_KEY
Thrown when:
- A manifest indicates content is encrypted (manifest.encryption.encrypted === true) but no encryptionKey is provided to restore().
Example:
throw new CasError(
  'Encryption key required to restore encrypted content',
  'MISSING_KEY',
);
Possible causes:
- Application logic error: Forgot to pass key to restore().
- Key was lost or not available in the current environment.
Recommended action:
- Verify the encryption key is available and passed to restore().
- If the key is lost, the content is permanently inaccessible.
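Callers can branch on these codes when handling failures. The sketch below assumes that CasError exposes the code as an `err.code` property, which the examples above suggest but do not confirm. The returned action names are purely illustrative.

```javascript
// Maps the documented security error codes to a coarse handling decision.
function classifySecurityError(err) {
  switch (err && err.code) {
    case 'INVALID_KEY_TYPE':
    case 'INVALID_KEY_LENGTH':
      return 'fix-key-input'; // key was decoded or generated incorrectly
    case 'MISSING_KEY':
      return 'supply-key'; // encrypted manifest, no key passed
    case 'INTEGRITY_ERROR':
      return 'check-key-then-backup'; // wrong key or corrupted data
    default:
      return 'rethrow'; // not a security-related code
  }
}

// Example with a stand-in error object:
const example = Object.assign(new Error('Encryption key required'), { code: 'MISSING_KEY' });
console.log(classifySecurityError(example));
```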
git-cas provides strong at-rest encryption and integrity guarantees through AES-256-GCM and SHA-256 chunk verification. However, it is critical to understand the limitations and caller responsibilities:
- Key management is entirely your responsibility. git-cas does not store or manage keys.
- Encrypted restore is not streaming. Large encrypted files may cause memory issues.
- No key rotation support. Re-encrypting content requires manual restore/store cycles.
- Metadata is not encrypted. File structure and sizes are visible to anyone with repository access.
- Logical deletion does not physically remove data. Use git gc to prune unreferenced objects.
For questions or security concerns, please review the ROADMAP or file an issue.