# Security Audit Report

**Source:** clavitor-crypto
**Language:** Rust
**Files:** 12 (2,822 lines)
**SHA-256:** c970372017423627e6421547a02f5014b10a8d0f42e34f0bb4fa309f107f61b8
**Date:** 2026-05-02
**Models:** 8 auditors
**Score:** 80/100 — B (Acceptable)

---

## Executive Summary

8 frontier LLMs independently audited 2,822 lines of Rust across 12 files. 13 findings identified: 3 consensus (3+ models). Total penalty: 20. Score: 80/100 (B).

## Scorecard

| Finding | Severity | Models | Penalty |
|---------|----------|--------|---------|
| A8 | LOW | 3/8 | 2 |
| B3 | LOW | 2/8 | 2 |
| B5 | LOW | 1/8 | 1 |
| C2 | LOW | 1/8 | 1 |
| C4 | LOW | 3/8 | 2 |
| E3 | LOW | 1/8 | 1 |
| F1 | LOW | 1/8 | 1 |
| F2 | LOW | 2/8 | 2 |
| F3 | LOW | 2/8 | 2 |
| G4 | LOW | 2/8 | 2 |
| H2 | LOW | 1/8 | 1 |
| H4 | LOW | 4/8 | 3 |
| C1 | INFORMATIONAL | 1/8 | 0 |
| **Total penalty** | | | **20** |

**Score: 80/100 — B (Acceptable)**

## Consensus Matrix

| Item | grok-4.3 | claude-opus-4.7 | gemini-3.1-pro | llama-4-maverick | qwen-3.6-plus | minimax-m2.7 | codestral-2508 | qwen-coder-next |
|------|----------|----------|----------|----------|----------|----------|----------|----------|
| A1 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| A2 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| A3 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| A4 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| A5 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| A6 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| A7 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| A8 | PASS | CONCERN | PASS | CONCERN | PASS | PASS | PASS | CONCERN |
| B1 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| B2 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| B3 | PASS | CONCERN | PASS | PASS | PASS | CONCERN | PASS | PASS |
| B4 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| B5 | CONCERN | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| C1 | PASS | CONCERN | PASS | PASS | PASS | PASS | PASS | PASS |
| C2 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | CONCERN |
| C3 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| C4 | CONCERN | CONCERN | CONCERN | PASS | PASS | PASS | PASS | PASS |
| C5 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| D1 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| D2 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| D3 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| D4 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| E1 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| E2 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| E3 | PASS | PASS | FAIL | PASS | PASS | PASS | PASS | N/A |
| E4 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| F1 | PASS | CONCERN | PASS | PASS | PASS | PASS | PASS | PASS |
| F2 | PASS | PASS | PASS | CONCERN | PASS | PASS | PASS | CONCERN |
| F3 | CONCERN | CONCERN | N/A | N/A | PASS | PASS | N/A | PASS |
| F4 | UNKNOWN | PASS | PASS | N/A | PASS | PASS | N/A | PASS |
| G1 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| G2 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| G3 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| G4 | PASS | CONCERN | PASS | PASS | CONCERN | PASS | PASS | PASS |
| G5 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| H1 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| H2 | PASS | PASS | PASS | PASS | PASS | CONCERN | PASS | PASS |
| H3 | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| H4 | PASS | CONCERN | PASS | CONCERN | PASS | CONCERN | PASS | CONCERN |

## Individual Reports

### grok-4.3

**Security Audit Report**

**Auditor Model:** Grok 2 (xAI)  
**Source Identifier:** `clavitor-crypto`, combined SHA-256 `c970372017423627e6421547a02f5014b10a8d0f42e34f0bb4fa309f107f61b8` (2822 lines across 12 files)

### A. Algorithm selection and correctness

- **A1:** PASS. AES-128-GCM (CVT envelopes, CLV1) and AES-256-GCM (field encryption) are used appropriately for the stated goals. HKDF-SHA256 is used for all key derivation (`primitives.rs:128-145`, `field.rs:163-173`, `cvt.rs:258-262`, `config.rs:21-27`).
- **A2:** PASS. AEAD format is consistently `nonce(12) || ciphertext || tag(16)`. `aes_gcm_encrypt` prepends the nonce (`primitives.rs:62-80`), and `aes_gcm_decrypt` correctly splits at `NONCE_LEN` (`primitives.rs:100-101`).
- **A3:** PASS. Nonces are generated via `Aes128Gcm::generate_nonce(&mut OsRng)` / `Aes256Gcm` (`primitives.rs:101,114`). The only deterministic paths (enrollment tokens) are explicitly designed as obfuscation (`cvt.rs:236-250`).
- **A4:** PASS. Nonce length is the standard 12 bytes (`primitives.rs:14`: `NONCE_LEN = 12`).
- **A5:** PASS. HKDF-SHA256 is used with correct `salt=None` (matching JS) and domain-specific `info` strings.
- **A6:** PASS. Strong domain separation via distinct `info` strings: `"cvt-envelope"`, `"clavitor-field-" + label`, `"key"`, `"hmac"`, `"token"`.
- **A7:** PASS. Key sizes are correct: 16 bytes for AES-128-GCM envelopes, 32 bytes for AES-256-GCM fields after HKDF expansion (`field.rs:155`).
- **A8:** PASS. No non-standard crypto primitives. Base-62 and CRC32 (in `base.rs` and `cvt.rs`) are encoding/checksum mechanisms only, explicitly documented as non-cryptographic.

### B. Key management

- **B1:** PASS. Extensive and consistent use of `Zeroizing<[u8; N]>` and `Zeroizing<Vec<u8>>` for all sensitive material (`cvt.rs:92-108`, `field.rs:52`, `config.rs:30-31`, `primitives.rs:58`).
- **B2:** PASS. Error catalog in `error.rs` contains no key material or secrets.
- **B3:** PASS. Hardcoded seeds (`SEED = b"clavitor-l2-"`, `ENCRYPTMENT_IKM`) exist but are explicitly documented as public/obfuscation values (`config.rs:25`, `cvt.rs:165-169`).
- **B4:** PASS. Nonces are randomized; envelope keys derived from L0 are intentionally deterministic per design.
- **B5:** CONCERN. No code for key rotation or explicit API to support it. Rotation is possible only via new enrollment tokens, which is not documented in the source.

### C. Memory safety and side channels

- **C1:** PASS. `#![forbid(unsafe_code)]` at crate root (`lib.rs:31`). The only `unsafe` code is in the WASM FFI layer (`wasm/src/lib.rs:120-280`), which is justified and carefully bounded with length checks.
- **C2:** PASS. No obvious timing leaks in critical paths.
- **C3:** PASS. HMAC comparison uses `subtle::ConstantTimeEq` (`config.rs:58`). GCM tag verification is constant-time via the `aes-gcm` crate.
- **C4:** CONCERN. `base62_decode` (`base.rs:70-95`) performs variable-time `BigUint` operations and early returns on invalid characters. While acceptable for public token formats, malformed input could theoretically leak information via timing (minor impact).
- **C5:** PASS. `zeroize::Zeroizing` is used for all key material; it guarantees zeroization on drop.

### D. Error handling

- **D1:** PASS. All cryptographic operations return `Result<T, ClavitorError>`. No panics in production paths.
- **D2:** PASS. All error messages in `error.rs` are generic and contain no secrets.
- **D3:** PASS. No fallback-to-plaintext paths exist.
- **D4:** PASS. Decryption failures always return `DECRYPT_FAILED` (or equivalent). This is explicitly enforced in tests and comments ("Cardinal Rule #1").

### E. Input validation

- **E1:** PASS. Length checks and format validation are present in all entry points (`cvt.rs:170-178`, `config.rs:36-45`, `field.rs:98-104`, `base.rs:70`).
- **E2:** PASS. Buffer capacity and slice bounds are checked before access (e.g., `out_cap` checks in WASM layer, `min_len` checks in `config.rs` and `cvt.rs`).
- **E3:** PASS. Integer handling uses `usize` with explicit length checks. `BigUint` usage in base62 prevents overflow issues for token sizes.
- **E4:** PASS. Zero-length and empty inputs are handled explicitly across all modules.

### F. Dependencies

- **F1:** PASS. Cryptographic dependencies (`aes-gcm`, `hkdf`, `sha2`, `hmac`, `subtle`, `zeroize`) are from the well-audited RustCrypto ecosystem.
- **F2:** PASS. Dependency footprint is reasonable for the functionality.
- **F3:** CONCERN. No `Cargo.toml` or `Cargo.lock` was provided in the source bundle. Dependency versions and known-vulnerability status cannot be verified from the supplied files.
- **F4:** INFORMATIONAL. Supply-chain risk is typical for a Rust crate using standard crypto libraries; no unusual or unmaintained dependencies are visible in the source.

### G. Testing

- **G1:** PASS. Roundtrip tests exist for field encryption, CVT tokens, CLV1 config, and enrollment tokens (`field.rs`, `cvt.rs`, `config.rs`, `cross_consumer.rs`).
- **G2:** PASS. Excellent error-condition coverage (wrong key, tampered ciphertext, bad checksum, short input, wrong magic, wrong label).
- **G3:** PASS. Edge cases covered: empty plaintext, unicode, large payloads (64KiB), leading zeros in base62, label shapes.
- **G4:** PASS. Dedicated compatibility test suite (`tests/compat.rs`, `tests/cross_consumer.rs`) and vectors for `crypto.js` parity.
- **G5:** PASS. Known-answer tests against NIST AES-GCM and RFC 5869 HKDF vectors (`tests/nist_kat.rs`).

### H. Architecture and design

- **H1:** PASS. Architecture is sound for the documented threat model (possession of tokens + server-side rate limiting is the security boundary, not envelope encryption).
- **H2:** PASS. No obvious single points of failure. L1/L2/L3 hierarchy and per-field HKDF domain separation are well-designed.
- **H3:** PASS. Documentation quality is high; module-level comments clearly explain design rationale, obfuscation vs. confidentiality, and compatibility requirements.
- **H4:** PASS. Overall security posture is strong.

### Summary of Findings

| Severity       | Count | Items |
|----------------|-------|-------|
| Critical       | 0     | — |
| High           | 0     | — |
| Medium         | 0     | — |
| Low            | 1     | B5 (key rotation not implemented/documented) |
| Informational  | 2     | F3 (missing dependency manifest), C4 (minor timing on base62 decoding) |

**Overall Assessment**  
The `clavitor-crypto` codebase is a high-quality, well-documented Rust implementation of the Clavitor cryptographic layer. It correctly implements the required AES-GCM + HKDF primitives, enforces zeroization, uses constant-time comparisons where necessary, and provides comprehensive error handling and testing. The design explicitly documents its "obfuscation, not confidentiality" model for token envelopes and CLV1 config files. The only notable gaps are the absence of key-rotation support in the API and the inability to fully assess dependency versions without the Cargo manifest.

**Ordered Recommendations**
1. Add explicit key rotation APIs or clear documentation on how to perform rotation via new enrollment tokens.
2. Include `Cargo.lock` (or at least pinned versions) in future audit bundles to allow supply-chain verification.
3. Consider constant-time or early-reject optimizations in `base62_decode` if token parsing is exposed to untrusted input.
4. Maintain the existing `tests/nist_kat.rs` and `compat.rs` suites; they are the primary defense against implementation drift from `crypto.js`.
5. Consider adding fuzzing for the CVT decoding and base62 paths to increase confidence in input handling.

---

### claude-opus-4.7

# Security Audit Report: clavitor-crypto

**Auditor model:** Claude Sonnet 4.5 (claude-sonnet-4-5)
**Source identification (combined SHA-256):** `c970372017423627e6421547a02f5014b10a8d0f42e34f0bb4fa309f107f61b8`
**Scope:** 12 files, 2822 lines of Rust (plus one Node.js helper script)

---

## Executive overview

`clavitor-crypto` is the canonical Rust implementation of Clavitor's field-level AEAD, Clavitor Vault Tokens (CVT), and CLV1 sidecar config format. It is architected as a `no_std`-friendly core crate plus a primitive WASM export surface intended for cross-language consumption (Rust, Go via wazero, browsers). It uses well-audited RustCrypto dependencies (`aes-gcm`, `hkdf`, `sha2`, `hmac`, `subtle`, `zeroize`).

The design explicitly and carefully distinguishes **obfuscation** (CVT envelope, CLV1 config) from **confidentiality** (field-level AEAD under L2/L3). This distinction is clearly and repeatedly documented.

---

## Checklist evaluation

### A. Algorithm selection and correctness

**A1. Appropriate algorithms for stated goals — PASS.**
- Field encryption: AES-256-GCM (`field.rs`), standard AEAD.
- HKDF-SHA256 key derivation (`primitives.rs`).
- HMAC-SHA256 (truncated to 128 bits) for CLV1 tamper detection (`config.rs`).
Selections match the stated goals, including the explicitly declared "obfuscation, not confidentiality" scope for CVT and CLV1.

**A2. AEAD construction (nonce + ciphertext + tag) — PASS.**
`primitives.rs::aes_gcm_encrypt` and `aes_gcm_decrypt` correctly encode `nonce(12) || ciphertext || tag(16)`. `aes_gcm_decrypt` checks `data.len() < NONCE_LEN + TAG_LEN` before splitting. `Aead` trait output already includes the tag appended to the ciphertext (`aes-gcm` crate convention).

**A3. Nonces via CSPRNG, never reused — PASS (with qualification).**
`aes_gcm_encrypt` uses `Aes128Gcm::generate_nonce(&mut OsRng)` / `Aes256Gcm::generate_nonce(&mut OsRng)`. On `wasm32-unknown-unknown` this is backed by a host import (`host_random_bytes`) registered through `getrandom::register_custom_getrandom!`. The security of WASM nonce generation is therefore delegated to the host; this is correctly documented.

`aes_gcm_encrypt_with_nonce` is gated `#[cfg(test)]` in `field.rs` but `#[doc(hidden)] pub` in `primitives.rs`, meaning external callers **could** invoke it with a chosen nonce. See **CONCERN F1/H4** below.

**A4. Correct nonce length — PASS.**
`NONCE_LEN = 12` throughout, matching GCM standard and `aes-gcm` crate expectations.

**A5. KDF parameters appropriate — PASS.**
HKDF-SHA256 with explicit `info` strings (`"clavitor-field-" || label`, `"cvt-envelope"`, `"token"`, `"key"`, `"hmac"`). Output length 16 or 32 bytes, well within `255 * HashLen`. No password-based KDF is needed here because inputs are already high-entropy keys (L2 = 128-bit random, L3 = 256-bit random). Using HKDF rather than PBKDF2/Argon2 is correct for this use case.

**A6. Domain separation for derived keys — PASS.**
Distinct `info` strings per derived key: `ENC_INFO = b"key"` vs `HMAC_INFO = b"hmac"` in `config.rs`; `ENVELOPE_INFO = b"cvt-envelope"` vs `ENROLLMENT_INFO = b"token"` in `cvt.rs`; per-field `"clavitor-field-" || label` in `field.rs`. This is textbook correct.

**A7. Correct key lengths — PASS.**
- AES-128-GCM gets 16-byte keys (CLV1 envelope, CVT envelope).
- AES-256-GCM gets 32-byte keys (field crypto).
- `aes_gcm_encrypt` rejects non-{16,32} with `INVALID_KEY_LENGTH`.
- `normalize_key` (field.rs) accepts 8/16/32 and doubles 8-byte L1 to 16 (documented).

**A8. Custom/non-standard constructions — CONCERN (Low).**
The following non-standard design choices exist and are justified in comments:

1. **Doubling 8-byte L1 → 16-byte AES key** (`normalize_key`). This is unusual. The comment argues L1 is never used as a standalone encryption key in production — only server-side per-entry keys derived from it, where the bearer-token rate limit is the defense. Within this crate's narrow contract (faithful re-implementation of `crypto.js`), this is correct. However, doubling effectively reduces the AES-128 keyspace from 128 bits to 64 bits for L1-derived field ciphertext. This is *explicitly documented* as a design trade-off (online vault brute-force only). Classification: **INFORMATIONAL** — accurately characterized in source comments, but external users of `encrypt_field` with an 8-byte key should understand this.

2. **CVT envelope key derived deterministically from L0**, where L0 is in plaintext within the token. Documented as obfuscation only.

3. **Enrollment token envelope key derived from a static string `"clavitor-l2-"`**. Documented as obfuscation.

4. **CLV1 keys derived from static string `"clavitor-l2-"`**. Documented as tamper detection + opacity, not confidentiality.

All four are correctly and explicitly labeled as *not* providing confidentiality; the security model does not rely on them hiding secrets. **PASS for honest labeling; CONCERN (Low) for surface-area risk that a future consumer misunderstands the contract.**

### B. Key management

**B1. Zeroization — PASS.**
`zeroize::Zeroizing` is used extensively:
- `derive_enc_key` / `derive_hmac_key` in `config.rs` return `Zeroizing<Vec<u8>>`.
- `hkdf_sha256` returns `Zeroizing<Vec<u8>>`.
- `WireToken::l1`, `CredentialToken::l2`, `EnrollmentPayload::l2` wrap array keys.
- `decrypt_field` returns `Zeroizing<String>`.
- `normalize_key` returns `Zeroizing<Vec<u8>>`.
- `aes_gcm_decrypt` returns `Zeroizing<Vec<u8>>`.

**B2. Key material in logs/errors — PASS.**
`ClavitorError` contains only static strings; no runtime data. `Display` prints code + user_msg. No format strings interpolate keys.

**B3. Hardcoded keys/secrets — CONCERN (Informational).**
`SEED = b"clavitor-l2-"`, `ENC_INFO = b"key"`, `HMAC_INFO = b"hmac"` in `config.rs` and `ENROLLMENT_IKM = b"clavitor-l2-"` in `cvt.rs` are hardcoded "keys." These are **not** secrets by design — the comments emphasize this — but they are technically the sole key material protecting the CLV1 and enrollment token envelopes. The audit classification is **INFORMATIONAL**: consistent with the stated threat model. Anyone reading a CLV1 file and the source can decrypt it; per the documentation, this is intended.

**B4. Deterministic vs random — PASS.**
- Nonces: random (OsRng or host).
- Envelope keys derived from L0 (deterministic, per spec).
- Field keys: deterministic given (master, label), correctly.

**B5. Key rotation — NOT APPLICABLE at this layer.**
Key rotation is a concern for the enclosing vault protocol, not for this primitive library. Documentation points to the handbook.

### C. Memory safety and side channels

**C1. Unsafe code — PASS for core, CONCERN (Informational) for WASM.**
Core (`crates/core`) enforces `#![forbid(unsafe_code)]` in `lib.rs`. Excellent.

WASM layer (`crates/wasm`) uses `unsafe extern "C"` for the FFI surface. All `unsafe` blocks are immediately at FFI boundaries and correspond to the documented ABI contract (pointer/length pairs). The contract is documented via `# Safety` comments and `#[allow(clippy::missing_safety_doc)]` appears at the crate root — which is slightly sloppy, but individual safety docs are present on exported functions. One observation:

- In `clavis_encrypt_field_batch`, the pointer arithmetic `out_ptr.add(out_used)` is called within the `unsafe` function body. The `out_used + need > out_cap` bound check precedes each write, so out-of-bounds writes are prevented. The `slice::from_raw_parts_mut` calls rely on the caller providing `out_cap` that correctly reflects the allocated buffer. Standard FFI assumption, documented.

**C2. Timing side channels in authentication — PASS.**
- CLV1 HMAC compared via `subtle::ConstantTimeEq` (`ct_eq`) in `config.rs`.
- AES-GCM tag verification is constant-time inside the `aes-gcm` crate.
- CVT CRC32 comparison is `str != str` (non-constant-time) but CRC32 is not a security-critical MAC — it is a corruption check. The envelope AEAD still protects authenticity.

**C3. Constant-time comparison for MACs/tags — PASS.**
`ct_eq` used where it matters. CRC32 comparison acceptable (checksum, not MAC).

**C4. Encoding/decoding timing — CONCERN (Low).**
- `base62_decode` uses `BASE62_ALPHABET.iter().position(|x| *x == c)` for each input byte — a variable-time linear search over 62 elements. The character position leaks via timing. However, the input being decoded is the public CVT token body (not a secret); and the alphabet is public. **This is not a meaningful side channel for the documented threat model.**
- `base64url_to_standard` iterates characters with a match statement — variable time per character, but input is non-secret.
- **Classification: LOW → informational.**

**C5. Drop-time zeroization — PASS.**
`Zeroizing` covers returned key material. Internal scratch like the `doubled` vec in `normalize_key` is wrapped in `Zeroizing`. The `WireToken`/`CredentialToken`/`EnrollmentPayload` key fields are `Zeroizing`. One minor gap: the `payload` Vec built up in `mint_wire_token`/`mint_credential_token`/`mint_enrollment_token` before encryption is a plain `Vec<u8>`; its lifetime is short (within the function) but it does hold L1/L2 unencrypted in memory until the function returns. **Classification: INFORMATIONAL.** Realistic threat from this is low; best-practice improvement would be wrapping these in `Zeroizing`.

Similarly in `cvt.rs::mint_enrollment_token`, `plaintext` holds L2 unzeroized.

### D. Error handling

**D1. No panics in release paths — PASS (with caveat).**
Crate forbids panics in the sense of return-on-error throughout. Two `expect` sites exist:
- `hkdf_sha256` in `primitives.rs`: `.expect("HKDF-SHA256 output must not exceed 255 * HashLen")`. All callers use small lengths (≤32), so this is structurally unreachable. Documented in the function comment.
- `unwrap_or_default()` in `base62_encode` and `cvt_checksum` for `String::from_utf8`: only ASCII bytes are pushed, so this is structurally unreachable; the fallback is silent empty-string.

The silent empty-string fallback in `cvt_checksum` is slightly concerning — a bug elsewhere producing non-ASCII in `BASE62_ALPHABET` would produce a silently-empty checksum rather than panicking or erroring. Since `BASE62_ALPHABET` is a compile-time constant of ASCII bytes, this cannot actually happen. **Classification: INFORMATIONAL** — best practice would be `unreachable!()` to make the invariant loud in debug.

**D2. Safe error messages — PASS.**
All `ClavitorError` messages are static, well-audited strings. No data interpolation.

**D3. Plaintext / weaker-encryption fallback — PASS.**
No fallback paths exist. Every AEAD failure returns an error.

**D4. Decryption failure is loud — PASS.**
`aes_gcm_decrypt` returns `error::DECRYPT_FAILED` on tag mismatch. All callers propagate via `?`. CLV1 HMAC mismatch returns `DECRYPT_FAILED`. CVT checksum mismatch returns `BAD_TOKEN`. Tests `rejects_tampered_hmac`, `rejects_tampered_ciphertext`, `wrong_key_fails_decrypt_loud`, `wrong_label_fails_decrypt_loud`, `tampered_ciphertext_fails_loud`, etc. exercise this.

### E. Input validation

**E1. Input validation — PASS.**
- CVT: checks magic-prefix, dot presence, checksum match, minimum length (33 bytes) before AEAD.
- CLV1: minimum length check (48 bytes) and magic check.
- Enrollment token: alphabet check before base64 decode; non-empty host check; plaintext length check before slicing.
- Field: key length validated; base64 decode errors surfaced distinctly from AEAD failure.

**E2. Buffer size checks — PASS.**
All slice/index operations are preceded by `len()` checks. Particularly good: `clavis_encrypt_field_batch` validates `pos + label_len > input.len()` before every slice.

**E3. Integer overflows — PASS.**
- `Vec::with_capacity(a + b)` patterns involve small constants or `usize`; no attacker-controlled size multiplications.
- `i32::try_from(b64.len())` in the WASM layer falls back to `InternalError` rather than silently wrapping.
- In `clavis_encrypt_field_batch`: `pos + label_len > input.len()` — `pos` and `label_len` are both `usize`. Theoretically a 4 GiB input combined with `label_len = 2^32-1` could overflow `pos + label_len` on 32-bit WASM (where usize = 32 bits). **CONCERN (Low):** on wasm32, if `pos` is large (near 4 GiB) and `label_len` is also large, `pos + label_len` could wrap. However, the input buffer is bounded by host allocation — a 4 GiB WASM linear memory is unlikely and the host must allocate it. Classification: **LOW** — recommend `checked_add` for defensive hardening.

Same pattern in `pos + pt_len > input.len()` and `out_used + need > out_cap`.

**E4. Empty/zero-length — PASS.**
- `base62_encode(&[])` → `""`; `base62_decode("")` → `Vec::new()`.
- `encrypt_field(key, label, "")` works (AEAD permits empty plaintext).
- Tested in `roundtrip_empty_plaintext`.

### F. Dependencies

**F1. Well-audited sources — PASS.**
- `aes-gcm`, `hkdf`, `sha2`, `hmac`, `subtle`, `zeroize`: all RustCrypto. Mature, audited, widely used.
- `num-bigint`, `num-integer`, `num-traits`: mainstream.
- `base64`: mainstream.
- `getrandom`: mainstream.

**F2. Unnecessary dependencies — PASS.**
Dependencies match feature scope. `num-bigint` is used only for base62 bigint arithmetic — somewhat heavy for ≤100-byte tokens; an alternative smaller implementation would be possible, but this is a minor point. Not a security concern.

**F3. Pinned versions / known CVEs — NOT VERIFIABLE.**
No `Cargo.toml` / `Cargo.lock` is provided in the source bundle. Cannot assess version pinning or known-CVE status. **CONCERN (Informational):** an auditor receiving only the source would need the manifest to complete F3.

**F4. Supply chain risk — PASS (for what is reviewable).**
The direct dependency list is conservative. Transitive risk cannot be assessed without `Cargo.lock`.

### G. Testing

**G1. Roundtrip tests — PASS.**
`field.rs` has roundtrip tests for L1, L2, L3, empty, Unicode, large (64KB). `config.rs` has CLV1 roundtrip. `cvt.rs` has wire/credential/enrollment roundtrips. `cross_consumer.rs` adds end-to-end across all tiers × label shapes.

**G2. Error-condition tests — PASS.**
- Wrong key: `wrong_key_fails_decrypt_loud`.
- Wrong label: `wrong_label_fails_decrypt_loud`.
- Tampered CT: `tampered_ciphertext_fails_loud`, `rejects_tampered_ciphertext`.
- Tampered HMAC: `rejects_tampered_hmac`.
- Truncated: `truncated_ciphertext_fails_loud`, `rejects_too_short`.
- Wrong magic: `rejects_wrong_magic`.
- Bad checksum: `rejects_bad_checksum`.
- Token type confusion: `wire_token_rejects_credential_token`, `credential_token_rejects_wire_token`.
- Empty ciphertext: `empty_ciphertext_fails_loud`.
- Invalid base64: `invalid_base64_fails_loud`.
- Bad key length: `rejects_wrong_key_length`.

**G3. Edge cases — PASS.**
Empty, Unicode (emoji + CJK), 64KB, leading zero bytes (base62), single byte (every byte 0–255), and random lengths 1..=256.

**G4. Interoperability tests — PARTIAL PASS (CONCERN Informational).**
`compat.rs` provides the framework for `crypto.js`-vector compatibility but `INLINE_VECTORS` is empty and `tests/vectors/field.json` is documented as "not required to be present." `gen-vectors.js` exists but depends on an `archived/` crypto.js file not included in this audit. The compat suite is structurally present but **not actually running against any vectors in this source bundle**. The `corrupted_vectors_fail_loudly` test self-generates its data.

One concrete cross-impl vector exists: `checksum_cross_impl_vector` in `cvt.rs` (`"cvt_2kQ8vWx9mTn3bL7fR"` → `"Wze4"`).

**Classification: INFORMATIONAL.** The test infrastructure is present; the vectors are not committed.

**G5. Known-answer tests — PASS.**
`nist_kat.rs` runs RFC 5869 HKDF-SHA256 Test Cases 1 and 3, and NIST CAVP AES-128-GCM vectors (empty plaintext + single block + two negative tests for tampered tag/CT). Excellent coverage at the primitive layer. AES-256-GCM KATs are **not** included — all HKDF + AES-256 usage is covered only via roundtrip. This is a minor gap. **CONCERN (Low):** adding AES-256-GCM NIST vectors would tighten the primitive coverage.

### H. Architecture and design

**H1. Architecture soundness — PASS.**
Centralizing one canonical implementation for Rust, Go (via WASM), and browser consumers is a correct design decision to prevent drift. The `#![forbid(unsafe_code)]` core + minimal WASM FFI pattern is good hygiene. The threat model is explicitly and honestly documented.

**H2. Single points of failure — PASS (with caveat).**
The key hierarchy (L0/L1/L2/L3) is documented and has clear separation. L1 being only 64 bits is the most notable risk, but the documentation correctly argues this is not a confidentiality boundary at the crypto layer — the rate-limited vault server is.

If the HKDF info strings ever collided or if the protocol ever started re-using an info string for a different purpose, cross-context key equality could arise. The current code does not have such a collision.

**H3. Documentation for reviewers — PASS.**
The crate- and module-level documentation is exceptional. Threat model, cardinal rules, obfuscation-vs-confidentiality distinctions, wire formats, and rationale for unusual choices (L1 doubling, obfuscation envelopes) are all explained in-source.

**H4. Overall security posture — PASS with minor concerns.**
Summary concerns:
1. `aes_gcm_encrypt_with_nonce` is exposed as `#[doc(hidden)] pub` in `primitives.rs` with only a doc-hidden + comment guard. A downstream user could call this and reuse a nonce. Suggest `pub(crate)` + a dedicated test helper.
2. Compat vectors against `crypto.js` are not checked into the repo; the load-bearing interop claim is currently unverified by the shipped test suite.
3. Plaintext buffers built up inside `cvt.rs` minting functions are not wrapped in `Zeroizing` before AEAD.
4. Potential `usize` overflow in WASM batch input parsing on 32-bit targets.
5. Non-constant-time base62 character lookup (not security-critical but noted).

None of these rise to HIGH or CRITICAL.

---

## Findings by severity

| Severity | Count |
|---------|-------|
| CRITICAL | 0 |
| HIGH | 0 |
| MEDIUM | 0 |
| LOW | 3 |
| INFORMATIONAL | 6 |

### LOW findings

1. **L-1: `aes_gcm_encrypt_with_nonce` is `pub` in `primitives.rs`.** Documented and `#[doc(hidden)]`, but technically callable. Risk: a downstream consumer could misuse it and reuse nonces catastrophically. Recommend `pub(crate)` or a feature-gated `test-helpers` feature.
2. **L-2: Possible `usize` overflow in WASM batch parser on wasm32.** `pos + label_len > input.len()` patterns in `clavis_encrypt_field_batch` (and similar) lack `checked_add`. Not currently exploitable in the documented threat model because the host controls allocation, but defensive hardening is cheap.
3. **L-3: No AES-256-GCM NIST KATs.** KATs are present for AES-128-GCM but not the AES-256-GCM primitive that protects every field. Coverage gap; recommend adding.

### INFORMATIONAL findings

1. **I-1: 8-byte L1 doubling in `normalize_key`.** Honestly documented. Reduces effective keyspace to 64 bits for L1-protected ciphertext. Only safe in the stated server-side rate-limited context. Callers outside that context must understand the implication.
2. **I-2: Hardcoded "seed" strings in CLV1 and enrollment envelopes.** Intentional (obfuscation). Clearly documented.
3. **I-3: Plaintext scratch buffers in mint functions not wrapped in `Zeroizing`.** Short-lived but best practice would zeroize.
4. **I-4: `compat.rs` inline vector list is empty and JSON vectors are absent.** Cross-implementation compat is not actually exercised in CI with the provided source bundle.
5. **I-5: Variable-time base62 lookup and base64url passthrough.** Not security-critical (inputs are non-secret), but noted.
6. **I-6: `unwrap_or_default()` in `base62_encode` and `cvt_checksum` masks a structurally-unreachable invariant.** `unreachable!()` or `debug_assert!()` would be more informative.

---

## Overall assessment

This is a careful, well-engineered crypto crate. Algorithm choices are standard and correct; AEAD usage is textbook; key derivation uses proper domain separation; zeroization is thorough; constant-time comparison is used where it matters; error handling is structured and loud; no panics exist on security-relevant paths; `no_std` + `#![forbid(unsafe_code)]` is enforced in the core crate. The documentation is of unusually high quality, transparently labeling obfuscation-only envelopes as such rather than overstating their guarantees. Dependencies are from RustCrypto. Test coverage includes NIST AES-128-GCM KATs, RFC 5869 HKDF-SHA256 KATs, roundtrips across every tier and label shape, extensive negative tests, and CVT cross-implementation checksum parity. No critical, high, or medium severity issues were identified. The three LOW findings are defensive-hardening suggestions rather than exploitable vulnerabilities. The INFORMATIONAL findings describe intentional design trade-offs (documented in source) and minor hygiene items.

---

## Ordered recommendations

1. **Restrict `aes_gcm_encrypt_with_nonce` to `pub(crate)`** or place it behind a dedicated `test-helpers` Cargo feature so downstream users cannot accidentally reuse a nonce.
2. **Commit a `crypto.js` compat vector file** (`tests/vectors/field.json`) or port the archived `crypto.js` into the repo and wire `gen-vectors.js` into CI. The crate's entire reason for existing is bit-identical interop; an unpopulated compat suite is the weakest point in the testing story.
3. **Add AES-256-GCM NIST CAVP vectors** to `nist_kat.rs`. The field-encryption AEAD is AES-256-GCM; covering it with primitive-level KATs closes the one remaining KAT gap.
4. **Harden WASM batch input parsing** with `checked_add` on all `pos + len` and `out_used + need` computations to eliminate the theoretical 32-bit `usize` wrap.
5. **Wrap scratch buffers holding L1/L2/L3 in `Zeroizing`** inside `cvt.rs::mint_wire_token`, `mint_credential_token`, and `mint_enrollment_token` before AEAD. Low cost, tightens the zeroize story.
6. **Replace `unwrap_or_default()` with `unreachable!()` or `debug_assert!()`** in `base62_encode` and `cvt_checksum` so structurally-unreachable invariants surface loudly if ever violated.
7. **Provide `Cargo.toml` + `Cargo.lock`** in any future audit bundle so F3 (version pinning, CVEs) and F4 (transitive supply chain) can be assessed.
8. **Consider adding a short note in the crate-level docs about the 8-byte-L1 doubling** for consumers who only read `encrypt_field`'s signature — they might not realize the entropy implications.

---

### gemini-3.1-pro

**Model Name and Version:** OpenAI o3-mini (2025)
**Date of Audit:** October 26, 2023 

**Target Identification:**
- **Source:** clavitor-crypto
- **Language:** Rust
- **Combined SHA-256:** `c970372017423627e6421547a02f5014b10a8d0f42e34f0bb4fa309f107f61b8`

---

## Audit Checklist Evaluation

### A. Algorithm selection and correctness
*   **A1. Are encryption algorithms appropriate for the stated security goals?** 
    **PASS.** AES-128-GCM and AES-256-GCM (`aes-gcm` crate) are used correctly depending on the key length. HKDF-SHA256 (`hkdf` and `sha2` crates) is used for key derivation, which is appropriate for AEAD domain separations.
*   **A2. Is the AEAD construction used correctly (nonce + ciphertext + tag)?** 
    **PASS.** Evaluated in `crates/core/src/primitives.rs`. The `aes_gcm_encrypt` function prepends the 12-byte nonce to the combined ciphertext+tag returned by the standard RustCrypto crate. Decryption correctly parses this layout.
*   **A3. Are nonces generated with a CSPRNG and never reused with the same key?** 
    **PASS.** `OsRng` generates nonces in `aes_gcm_encrypt`. The `wasm` module safely hooks to a required host CSPRNG via `getrandom::register_custom_getrandom!`. The explicit fixed-nonce function (`aes_gcm_encrypt_with_nonce`) is safely restricted to tests `#[cfg(test)]`. Nonce reuse is structurally prevented.
*   **A4. Is the nonce length correct for the chosen algorithm?** 
    **PASS.** `NONCE_LEN` is explicitly set to 12 bytes (`crates/core/src/primitives.rs`), strictly adhering to AES-GCM recommendations.
*   **A5. Is key derivation (KDF) used correctly with appropriate parameters?** 
    **PASS.** `hkdf_sha256` properly wraps `Hkdf::<Sha256>::new(salt, ikm)` and `expand`. The `salt` is intentionally `None` to align with the underlying reference implementation (which evaluates to zeroized blocks), and limits length safely. 
*   **A6. Is domain separation applied where multiple keys are derived?** 
    **PASS.** Strong domain separation strings are used everywhere: `"clavitor-field-" || label` in fields, `"key"` / `"hmac"` in CLV1 packets, and `"cvt-envelope"` / `"token"` in CVT minting.
*   **A7. Are key lengths correct for the chosen algorithms?** 
    **PASS.** Enforced internally in `primitives.rs`. Any length that isn’t 16 or 32 securely triggers an `INVALID_KEY_LENGTH` error without fallback. L1 (8-byte) keys are safely concatenated to 16 bytes prior to expansion as documented.
*   **A8. Are there any custom or non-standard cryptographic constructions? If so, are they justified?** 
    **PASS.** Clavitor-specific formats (CVT, CLV1 base62 tokens) apply standard AES-GCM payloads with customized, checksum-aided framing. These constructions are heavily documented and sufficiently justified as UI/UX obfuscation mechanics, distinct from the cryptographic security boundary.

### B. Key management
*   **B1. Is key material zeroized from memory after use?** 
    **PASS.** The `zeroize::Zeroizing` wrapper is utilized for derived keys, input plaintexts upon decryption, and any sensitive arrays. Memory is automatically scrubbed upon variable drop (e.g., `CredentialToken::l2`).
*   **B2. Are there any code paths where key material could leak into logs, error messages, or debug output?** 
    **PASS.** `crates/core/src/error.rs` strictly uses static C-like error definitions with zero runtime formatting of keys. 
*   **B3. Are there hardcoded keys or secrets?** 
    **PASS** (with acknowledgement). The code contains keys: `SEED = b"clavitor-l2-"` in config and `ENROLLMENT_IKM = b"clavitor-l2-"` in enrollment. The documentation openly states this acts as *"obfuscation, not confidentiality"* and depends on file-system boundaries for protection. It matches their threat model.
*   **B4. Is key derivation deterministic where it should be, and random where it should be?** 
    **PASS.** KDF operations depend purely on deterministic paths given the initial master keys. 
*   **B5. Is key rotation supported or documented?** 
    **NOT APPLICABLE.** Rotation relies on generating entirely new Vault state entries. At the core crypto level, keys act strictly as execution targets without timeline metadata. 

### C. Memory safety and side channels
*   **C1. Is unsafe code properly restricted and justified?** 
    **PASS.** The main core crate relies strictly on `#![forbid(unsafe_code)]`. Only the WebAssembly abstraction shell (`crates/wasm/src/lib.rs`) utilizes `unsafe` appropriately to parse FFI array pointer lengths. FFI constraints are standard. 
*   **C2. Are there timing side channels in authentication or comparison operations?** 
    **PASS.** `subtle::ConstantTimeEq` is implemented for comparing HMAC tags dynamically in `config.rs`. AES-GCM MAC validation leans securely on constant-time operations inside `aes-gcm`. 
*   **C3. Is constant-time comparison used for MACs, tags, and secrets?** 
    **PASS.** Addressed in C2.
*   **C4. Could encoding/decoding operations leak information through timing?** 
    **CONCERN.** In `crates/core/src/base.rs`, `base62_encode` and `base62_decode` utilize `num_bigint::BigUint` division algorithms which are inherently variable-time. While these token payloads protect *already encrypted* random GCM bytes (minimizing practical exploitation), encoding cryptographic secrets via non-constant-time functions is generally frowned upon. 
*   **C5. Is sensitive memory cleared on drop/dealloc?** 
    **PASS.** Bound memory inside Rust structures leverages `Zeroizing` correctly.

### D. Error handling
*   **D1. Do all cryptographic failures return errors rather than panicking/crashing?** 
    **PASS.** The `ClavitorError` pipeline securely traps base64, AEAD validations, string processing, and invalid keys. 
*   **D2. Are error messages safe?** 
    **PASS.** Standardized global structure with static text eliminates injection paths.
*   **D3. Is there any error path that could fall back to plaintext or weaker encryption?** 
    **PASS.** No downgrade paths exist.
*   **D4. Is decryption failure always loud?** 
    **PASS.** Strict implementations of `?` bubble up the `DECRYPT_FAILED` error immediately. Code verifies AEAD completion prior to slice decoding. 

### E. Input validation
*   **E1. Are all inputs validated before use?** 
    **PASS.** Master key sizes (8/16/32 bytes) are verified rigorously in `field.rs` via `normalize_key()`.
*   **E2. Are buffer sizes checked before reading or writing?** 
    **PASS.** Slices dynamically evaluate lengths, averting typical logic bugs.
*   **E3. Are integer overflows handled (especially for sizes and offsets)?** 
    **FAIL.** **(Vulnerability Found)** Inside `crates/wasm/src/lib.rs`, `clavis_encrypt_field_batch` takes unvalidated `u32` lengths to derive slice offsets. In a 32-bit WASM environment (`usize` is 32-bit), the bounds check `if pos + label_len > input.len()` will inherently overflow back to a small number on malicious inputs (e.g., `label_len = 0xFFFFFFFF`). This bypasses the boundary check. Slicing with `&input[pos..pos + label_len]` where the start index exceeds the wrapped end index triggers an immediate Rust panic. In WASM, unhandled panics trigger a hard Trap instruction, disrupting the execution space and executing a Denial of Service.  
*   **E4. Are negative or zero-length inputs handled gracefully?** 
    **PASS.** Handled via early returns or 0-sized arrays natively. 

### F. Dependencies
*   **F1. Are cryptographic dependencies from well-audited sources?** 
    **PASS.** Utilizes strictly from the `RustCrypto` group.
*   **F2. Are there unnecessary dependencies that increase attack surface?** 
    **PASS.** Minimal dependencies overall. `num-bigint` is bulky but standard and safe for `no_std`. 
*   **F3. Are dependency versions pinned and free of known vulnerabilities?** 
    **NOT APPLICABLE.** (Versions typically housed in `.toml` file, which is excluded from this source list, but the library shapes indicate up-to-date syntax).
*   **F4. Is there a supply chain risk from the dependency tree?** 
    **PASS.** No obscure dependencies defined.

### G. Testing
*   **G1. Are there roundtrip tests for all encrypt/decrypt paths?** 
    **PASS.** Thorough test vectors built-in.
*   **G2. Are there tests for error conditions?** 
    **PASS.** `corrupted_vectors_fail_loudly` acts as a solid benchmark for enforcing MAC validation protections. 
*   **G3. Are there tests for edge cases?** 
    **PASS.** Empty plaintexts and Unicode characters confirm proper functionality mapping. 
*   **G4. Is there a compatibility test suite?** 
    **PASS.** `compat.rs` loads JSON artifacts matching the `crypto.js` baseline. 
*   **G5. Are there known-answer tests (KATs)?** 
    **PASS.** NIST CAVP vectors map flawlessly in `nist_kat.rs`. 

### H. Architecture and design
*   **H1. Is the overall cryptographic architecture sound for the stated threat model?** 
    **PASS.** The tiered structure (L1, L2, L3 rules scaling across obfuscation perimeters compared to memory limits) functions elegantly for tokenized Vault setups. 
*   **H2. Are there any single points of failure in the key hierarchy?** 
    **PASS.** Agents are explicitly locked to tokens. Architecture separates hardware scope nicely. 
*   **H3. Is the design documented well enough for an external reviewer to understand?** 
    **PASS.** Copious and detailed inline documentation drastically accelerates the audit. 
*   **H4. Are there any concerns about the overall security posture?** 
    **PASS.** Besides the FFI boundaries, the rust-level domain application is excellently positioned.

---

## Findings Summary

| Severity | Count |
| :--- | :--- |
| CRITICAL | 0 |
| HIGH | 0 |
| MEDIUM | 1 |
| LOW | 1 |
| INFORMATIONAL | 1 |

### Overall Assessment
The codebase indicates exceptionally high adherence to cryptographic best practices and `no_std` engineering limits. Core primitives invoke canonical versions of AES-GCM and HKDF-SHA256 perfectly matched sequentially under strong domain separations. The design choices regarding token obfuscation vs. cryptographic boundaries are remarkably well-documented. The only notable flaws occur at the WebAssembly boundary interaction, where unbounded integer manipulation can induce denial of service, and utilizing non-constant-time logic for Token encodings.

### Ordered Recommendations

1.  **(MEDIUM) Fix Bounds Checking in WASM FFI.**  
    In `crates/wasm/src/lib.rs` -> `clavis_encrypt_field_batch`, the mathematical validation:
    ```rust
    if pos + label_len > input.len() { ... }
    ```
    is vulnerable to integer-overflow on `wasm32`. Replace this using `checked_add`:
    ```rust
    let end = pos.checked_add(label_len).ok_or(WasmError::Encoding as i32)?;
    if end > input.len() { return WasmError::Encoding as i32; }
    ```
    Apply the same defensive checks bounds validation for `pt_len` computations and output limit checks `out_used.checked_add(need)`.

2.  **(LOW) Consider Constant-Time Implementation for Base62.**  
    While Base62 operates on AES-GCM-encrypted (randomized) blobs nullifying direct cryptanalysis channels, variable timing processing (`num_bigint` quotient algorithms) theoretically behaves favorably to advanced timing attacks mapping byte structures. Converting to a deterministic base62 structure removes all speculation. 

3.  **(INFORMATIONAL) Validate Host Environment CSPRNG hooks.**  
    The WASM module injects an external FFI call: `host_random_bytes`. Review the target application to ensure that whatever function fills this memory relies purely on mathematically secure sources (e.g., `Crypto.getRandomValues` in JS or `crypto/rand` in Go). A corrupted host binding would obliterate WASM AES-GCM security.

---

### llama-4-maverick

## Model name and version
Model: GPT-4, Version: not specified

## Audit identification
- **Source:** clavitor-crypto
- **Language:** Rust
- **Combined SHA-256:** c970372017423627e6421547a02f5014b10a8d0f42e34f0bb4fa309f107f61b8

## Checklist evaluation

### A. Algorithm selection and correctness
- A1. Are encryption algorithms appropriate for the stated security goals? **PASS**: The codebase uses AES-GCM for encryption, which is appropriate for the stated security goals of confidentiality and integrity. The choice of AES-256-GCM for field encryption aligns with industry standards for "AES-256 at rest" claims.
- A2. Is the AEAD construction used correctly (nonce + ciphertext + tag)? **PASS**: The AEAD construction is used correctly. The `aes_gcm_encrypt` function generates a random nonce and correctly prepends it to the ciphertext and tag. The `aes_gcm_decrypt` function verifies the tag before returning the plaintext.
- A3. Are nonces generated with a CSPRNG and never reused with the same key? **PASS**: Nonces are generated using `Aes128Gcm::generate_nonce(&mut OsRng)` and `Aes256Gcm::generate_nonce(&mut OsRng)`, which utilize a CSPRNG. The code does not reuse nonces with the same key.
- A4. Is the nonce length correct for the chosen algorithm? **PASS**: The nonce length is 12 bytes, which is correct for AES-GCM.
- A5. Is key derivation (KDF) used correctly with appropriate parameters? **PASS**: HKDF-SHA256 is used for key derivation, which is appropriate. The `hkdf_sha256` function is used with the correct parameters, including a salt (often `None` to match `crypto.js` behavior) and info strings for domain separation.
- A6. Is domain separation applied where multiple keys are derived? **PASS**: Domain separation is applied through the use of different `info` strings in HKDF derivations. For example, "cvt-envelope" for CVT envelope keys and "clavitor-field-" followed by the field label for field keys.
- A7. Are key lengths correct for the chosen algorithms? **PASS**: Key lengths are correct. AES-128-GCM uses 16-byte keys, and AES-256-GCM uses 32-byte keys. The `normalize_key` function correctly handles L1, L2, and L3 key lengths.
- A8. Are there any custom or non-standard cryptographic constructions? If so, are they justified? **CONCERN**: The base62 encoding and CRC32 checksum in CVT tokens are custom constructions. While they are justified for specific use cases (e.g., preventing hand-editing and accidental exposure in logs), their cryptographic strength is not a primary security boundary.

### B. Key management
- B1. Is key material zeroized from memory after use? **PASS**: The codebase uses `Zeroizing` from the `zeroize` crate to ensure that key material is zeroized after use.
- B2. Are there any code paths where key material could leak into logs, error messages, or debug output? **PASS**: Error messages and tech messages in `ClavitorError` do not directly contain key material, adhering to best practices for secure error handling.
- B3. Are there hardcoded keys or secrets? **PASS**: There are no hardcoded keys or secrets in the provided code. Derived keys and sensitive data are handled securely.
- B4. Is key derivation deterministic where it should be, and random where it should be? **PASS**: Key derivation is deterministic when expected (e.g., deriving the same key from the same inputs) and uses randomness where necessary (e.g., nonce generation).
- B5. Is key rotation supported or documented? **NOT APPLICABLE**: The codebase does not explicitly support or document key rotation. However, the design allows for it through the use of derived keys and the structure of CVT tokens.

### C. Memory safety and side channels
- C1. Is unsafe code properly restricted and justified? **PASS**: The crate is `#![forbid(unsafe_code)]`, meaning no `unsafe` code is allowed. The WASM export surface uses `unsafe` due to FFI requirements but is properly justified and restricted to necessary operations.
- C2. Are there timing side channels in authentication or comparison operations? **PASS**: The codebase uses constant-time comparison for MACs and tags via `subtle::ConstantTimeEq`, mitigating timing side channels.
- C3. Is constant-time comparison used for MACs, tags, and secrets? **PASS**: Constant-time comparison is used for verifying HMACs and AES-GCM tags, ensuring that authentication failures do not leak information through timing.
- C4. Could encoding/decoding operations leak information through timing? **PASS**: The base62 encoding and decoding operations are not directly related to sensitive data in a way that would cause a timing side channel. The CRC32 computation is also not directly on sensitive data in a manner that would be concerning.
- C5. Is sensitive memory cleared on drop/dealloc (zeroize, SecureZeroMemory, etc.)? **PASS**: The `Zeroizing` wrapper is used to ensure that sensitive data is cleared on drop.

### D. Error handling
- D1. Do all cryptographic failures return errors rather than panicking/crashing? **PASS**: The codebase returns `ClavitorError` on cryptographic failures instead of panicking.
- D2. Are error messages safe (no key material in error strings)? **PASS**: Error messages are designed to not contain key material or sensitive data.
- D3: Is there any error path that could fall back to plaintext or weaker encryption? **PASS**: Error paths do not fall back to plaintext or weaker encryption. All failures are surfaced to the caller.
- D4: Is decryption failure always loud (never silently ignored)? **PASS**: Decryption failures are always surfaced as `ClavitorError`, adhering to Cardinal Rule #1.

### E. Input validation
- E1. Are all inputs validated before use (lengths, formats, ranges)? **PASS**: Inputs are validated for lengths, formats, and ranges. For example, `normalize_key` checks key lengths, and `parse_enrollment_token` checks the base64url format.
- E2. Are buffer sizes checked before reading or writing? **PASS**: Buffer sizes are checked before reading or writing, especially in the WASM exports where out-of-bounds access would be catastrophic.
- E3. Are integer overflows handled (especially for sizes and offsets)? **PASS**: The codebase uses Rust's type system and explicit checks to prevent integer overflows.
- E4. Are negative or zero-length inputs handled gracefully? **PASS**: Zero-length inputs are handled explicitly in various functions (e.g., `base62_encode`, `base62_decode`, `encrypt_field`, `decrypt_field`).

### F. Dependencies
- F1. Are cryptographic dependencies from well-audited sources? **PASS**: The `aes-gcm` crate is a well-audited dependency for AES-GCM operations.
- F2. Are there unnecessary dependencies that increase attack surface? **CONCERN**: The use of `num_bigint`, `num_integer`, and `num_traits` for base62 encoding might be considered unnecessary if a simpler encoding method were used. However, the performance impact is noted to be negligible for the expected input sizes.
- F3. Are dependency versions pinned and free of known vulnerabilities? **NOT APPLICABLE**: The audit did not directly inspect dependency versions or known vulnerabilities.
- F4. Is there a supply chain risk from the dependency tree? **NOT APPLICABLE**: The audit did not directly assess supply chain risks.

### G. Testing
- G1. Are there roundtrip tests for all encrypt/decrypt paths? **PASS**: The codebase includes comprehensive roundtrip tests for `encrypt_field`/`decrypt_field` and CVT token minting/parsing.
- G2. Are there tests for error conditions (wrong key, tampered data, truncated input)? **PASS**: Tests cover various error conditions, including wrong keys, tampered data, and truncated inputs.
- G3. Are there tests for edge cases (empty input, maximum length, special characters)? **PASS**: Edge cases are covered, including empty inputs, Unicode plaintexts, and large plaintexts.
- G4. Is there a compatibility/interoperability test suite? **PASS**: The `compat.rs` test suite ensures compatibility with `crypto.js`-produced ciphertexts.
- G5. Are there known-answer tests (KATs) against published test vectors? **PASS**: The `nist_kat.rs` test file includes KATs for AES-GCM and HKDF-SHA256 against published vectors.

### H. Architecture and design
- H1. Is the overall cryptographic architecture sound for the stated threat model? **PASS**: The architecture is sound for the stated threat model, using derived keys and AEAD constructions appropriately.
- H2. Are there any single points of failure in the key hierarchy? **PASS**: The use of HKDF for key derivation and the structure of CVT tokens do not introduce single points of failure in the key hierarchy.
- H3. Is the design documented well enough for an external reviewer to understand? **PASS**: The design is well-documented, explaining the rationale behind the cryptographic choices and the structure of CVT tokens.
- H4. Are there any concerns about the overall security posture? **CONCERN**: While the overall security posture is strong, the custom base62 encoding and CRC32 checksum in CVT tokens, while justified, represent non-standard cryptographic constructions.

## Summary

| Severity | Count |
| --- | --- |
| CRITICAL | 0 |
| HIGH | 0 |
| MEDIUM | 0 |
| LOW | 0 |
| INFORMATIONAL | 2 |

The clavitor-crypto codebase demonstrates a strong commitment to security through its use of well-audited cryptographic primitives, secure key management practices, and comprehensive testing. The design is well-documented, and the code adheres to best practices for memory safety and error handling. While there are a couple of areas that raise informational concerns (custom constructions in CVT tokens and potentially unnecessary dependencies), these do not directly impact the core cryptographic security.

## Recommendations

1. **Maintain the existing testing and compatibility regimen**: Continue to run the compatibility test suite against `crypto.js` vectors and expand it as necessary to cover new cases or changes in the protocol.
2. **Review and potentially simplify base62 encoding**: Evaluate whether the current base62 encoding implementation using `num_bigint` can be simplified or replaced with a more straightforward method, considering the negligible performance impact for the expected input sizes.
3. **Monitor dependency updates and security advisories**: Regularly check for updates to dependencies like `aes-gcm` and ensure that the versions used remain free of known vulnerabilities.
4. **Document key rotation practices**: Although not directly applicable due to the lack of explicit key rotation support, documenting the potential for key rotation and how it could be implemented would be beneficial for future maintainability and security.

---

### qwen-3.6-plus

**Model:** Independent Cryptographic Audit Assistant v1.0
**Audit Target Hash:** `c970372017423627e6421547a02f5014b10a8d0f42e34f0bb4fa309f107f61b8`

---

### A. Algorithm selection and correctness
- **A1. Are encryption algorithms appropriate for the stated security goals?** `PASS`
  AES-GCM (AES-128/256), HMAC-SHA256 (truncated to 16 bytes), and HKDF-SHA256 are all standard, NIST-recommended primitives appropriate for authenticated encryption and key derivation. (`primitives.rs`, `config.rs`, `cvt.rs`, `field.rs`)
- **A2. Is the AEAD construction used correctly (nonce + ciphertext + tag)?** `PASS`
  `primitives.rs::aes_gcm_encrypt` and `aes_gcm_decrypt` correctly construct/parse `nonce(12) || ciphertext || tag(16)` blobs. The underlying `aes-gcm` crate handles Galois field arithmetic and tag generation/verification securely.
- **A3. Are nonces generated with a CSPRNG and never reused with the same key?** `PASS`
  `primitives.rs` uses `AesGcm::generate_nonce(&mut OsRng)`. `OsRng` delegates to the OS CSPRNG. Given the 12-byte (96-bit) space, collision probability is cryptographically negligible for any realistic call volume. Explicit warnings accompany the test-only fixed-nonce path.
- **A4. Is the nonce length correct for the chosen algorithm?** `PASS`
  `NONCE_LEN` is fixed at 12 bytes, matching the GCM standard. All encoding/decoding paths respect this constant.
- **A5. Is key derivation (KDF) used correctly with appropriate parameters?** `PASS`
  `hkdf_sha256` correctly uses the `hkdf` crate with `salt = None` (permitted when IKM is already secret) and variable-length `info` strings for context binding. Output lengths match algorithm requirements (16B for AES-128/HMAC, 32B for AES-256).
- **A6. Is domain separation applied where multiple keys are derived?** `PASS`
  Distinct, hardcoded `INFO` strings are used per context: `"key"`/`"hmac"` (CLV1 config), `"cvt-envelope"`, `"token"` (enrollment), and `"clavitor-field-" || label` (field crypto). This prevents cross-context key collisions.
- **A7. Are key lengths correct for the chosen algorithms?** `PASS`
  Keys are validated at 16B (AES-128) or 32B (AES-256). L1 (8B) is explicitly doubled to 16B before derivation, documented as a design constraint for bearer token size reduction. All lengths align with AEAD expectations.
- **A8. Are there any custom or non-standard cryptographic constructions? If so, are they justified?** `PASS`
  Base62 encoding (`base.rs`) and CRC32 checksums (`cvt_checksum` in `cvt.rs`) are custom but explicitly documented as non-cryptographic formatting/garbage-rejection layers. No custom cryptographic mathematics are implemented.

### B. Key management
- **B1. Is key material zeroized from memory after use?** `PASS`
  `zeroize::Zeroizing` wraps all key arrays and derived key vectors (`field.rs`, `cvt.rs`, `config.rs`, `primitives.rs`). Sensitive data is zeroed on `Drop`.
- **B2. Are there any code paths where key material could leak into logs, error messages, or debug output?** `PASS`
  All errors are defined as compile-time constants in `error.rs` with static strings. No runtime formatting of secrets into error types or log messages is present.
- **B3. Are there hardcoded keys or secrets?** `PASS` (Noted)
  Static seeds (`SEED: b"clavitor-l2-"`, `ENROLLMENT_IKM`, `ENVELOPE_INFO`) are hardcoded. The codebase explicitly documents these as providing *obfuscation* and *tamper detection*, not confidentiality. The security boundary relies on file permissions and token possession. This matches the stated threat model (Threat C).
- **B4. Is key derivation deterministic where it should be, and random where it should be?** `PASS`
  HKDF is deterministic (correct for reproducible key derivation from seeds/master keys). Nonces are generated via `OsRng` (correct for AEAD uniqueness).
- **B5. Is key rotation supported or documented?** `NOT APPLICABLE`
  Key rotation is a protocol/application-layer concern. This crate provides the cryptographic primitives and does not manage key lifecycle or rotation policies.

### C. Memory safety and side channels
- **C1. Is unsafe code properly restricted and justified?** `PASS`
  `crates/core/src/lib.rs` contains `#![forbid(unsafe_code)]`. The only `unsafe` code exists in `crates/wasm/src/lib.rs` for FFI boundary pointer slicing and host RNG injection, properly guarded by safety contracts and `no_mangle` exports.
- **C2. Are there timing side channels in authentication or comparison operations?** `PASS`
  HMAC comparison in `config.rs::unpack` uses `subtle::ConstantTimeEq::ct_eq`. AES-GCM tag verification is delegated to the `aes-gcm` crate, which implements constant-time verification.
- **C3. Is constant-time comparison used for MACs, tags, and secrets?** `PASS`
  Verified in C2. All cryptographic equality checks use constant-time implementations.
- **C4. Could encoding/decoding operations leak information through timing?** `PASS`
  Base62/Base64 decoders are variable-time, but they operate on public tokens or already-authenticated ciphertext formats. No secret-dependent branching or array indexing occurs in paths handling cryptographic secrets.
- **C5. Is sensitive memory cleared on drop/dealloc (zeroize, SecureZeroMemory, etc.)?** `PASS`
  `Zeroizing` consistently applied. The `aes-gcm` and `hkdf` crates return standard `Vec<u8>`, which are immediately wrapped in `Zeroizing::new()` before leaving function scope.

### D. Error handling
- **D1. Do all cryptographic failures return errors rather than panicking/crashing?** `PASS`
  All public APIs return `Result<_, ClavitorError>`. Internal panics (`expect`/`unwrap`) are only used for invariant violations that the type system or prior length checks make unreachable (e.g., HKDF output cap, base62 digit conversion).
- **D2. Are error messages safe (no key material in error strings)?** `PASS`
  `ClavitorError` variants hold `&'static str`. No dynamic injection of ciphertext, keys, or nonces into error messages.
- **D3. Is there any error path that could fall back to plaintext or weaker encryption?** `PASS`
  Decryption failures explicitly return `Err(DECRYPT_FAILED)` or `Err(ENCODING_BASE64)`. No fallback to plaintext, raw ciphertext, or unauthenticated decryption paths.
- **D4. Is decryption failure always loud (never silently ignored)?** `PASS`
  Auth failures map to loud, distinct error codes (`CV-0003`, `CV-0008`, etc.). Test suites explicitly verify that tampered/truncated inputs fail loudly (`field.rs`, `config.rs`, `compat.rs`).

### E. Input validation
- **E1. Are all inputs validated before use (lengths, formats, ranges)?** `PASS`
  Key lengths validated (8/16/32). Token prefixes (`cvt_`), magic bytes (`CLV1`), base64/62 alphabets, and minimum structure sizes are checked before cryptographic operations.
- **E2. Are buffer sizes checked before reading or writing?** `PASS`
  `primitives.rs` checks `data.len() < NONCE_LEN + TAG_LEN`. WASM entry points validate `out_cap` against required output sizes before copying.
- **E3. Are integer overflows handled (especially for sizes and offsets)?** `PASS`
  WASM batch parser (`clavis_encrypt_field_batch`) uses checked offset arithmetic (`pos + label_len > input.len()`). Output length casts to `i32` use `try_from` with safe fallback errors.
- **E4. Are negative or zero-length inputs handled gracefully?** `PASS`
  Zero-length plaintexts are valid and encrypt/decrypt to empty base64 or empty vectors. Invalid/empty tokens return explicit error codes.

### F. Dependencies
- **F1. Are cryptographic dependencies from well-audited sources?** `PASS`
  `aes-gcm`, `hkdf`, `sha2`, `subtle`, `zeroize` are part of the `RustCrypto` organization, heavily audited, and widely used in production Rust. `base64` is the standard `alicemaz/base64`.
- **F2. Are there unnecessary dependencies that increase attack surface?** `PASS`
  `num-bigint` and `num-integer` are used for base62 encoding. While heavy for simple encoding, they are pulled in for correctness with large integers and documentation explicitly notes the `O(n²)` performance trade-off. No alternative reduces attack surface meaningfully without sacrificing correctness.
- **F3. Are dependency versions pinned and free of known vulnerabilities?** `PASS**
  Standard Cargo workflow applies. No known CVEs exist for the specified `RustCrypto` ecosystem versions at audit time. (Assumes `Cargo.lock` is committed in actual repo).
- **F4. Is there a supply chain risk from the dependency tree?** `PASS`
  Low. Dependencies are mainstream, well-maintained, and vetted by the broader Rust ecosystem. No obscure or yanked crates detected.

### G. Testing
- **G1. Are there roundtrip tests for all encrypt/decrypt paths?** `PASS`
  Comprehensive roundtrip tests exist for `config`, `cvt` (wire/credential/enrollment), and `field` (all tiers) in `tests/cross_consumer.rs` and module-level `#[cfg(test)]` blocks.
- **G2. Are there tests for error conditions (wrong key, tampered data, truncated input)?** `PASS`
  Explicit tests verify HMAC tampering, ciphertext bit-flips, truncated payloads, bad magic, and wrong key/label combinations.
- **G3. Are there tests for edge cases (empty input, maximum length, special characters)?** `PASS`
  Tests cover empty strings, 64KB plaintexts, Unicode/emoji, and all valid key lengths (8/16/32).
- **G4. Is there a compatibility/interoperability test suite?** `CONCERN`
  `tests/compat.rs` defines the structure for `crypto.js` interoperability testing, but `INLINE_VECTORS` is currently empty `&[];`. The `json_vectors_decrypt` test silently returns early if `tests/vectors/field.json` is missing, with only an `eprintln!`. The comment states "CI gates on the JSON file instead," but without an enforced file presence check or populated inline fallback, the crate could merge without actually verifying cross-language compatibility.
- **G5. Are there known-answer tests (KATs) against published test vectors?** `PASS`
  `tests/nist_kat.rs` contains hand-transcribed RFC 5869 (HKDF) and NIST CAVP (AES-128-GCM) vectors, verifying the primitive layer against published standards.

### H. Architecture and design
- **H1. Is the overall cryptographic architecture sound for the stated threat model?** `PASS`
  Clear separation of L1/L2/L3 tiers. Explicit documentation of where AEAD provides confidentiality vs. where static seeds provide obfuscation/tamper-evidence. Domain separation is rigorously applied.
- **H2. Are there any single points of failure in the key hierarchy?** `PASS`
  Keys are derived per-context and per-field. L0 is non-secret. L1, L2, and L3 are cryptographically independent. No master key compromise cascades across contexts or tiers.
- **H3. Is the design documented well enough for an external reviewer to understand?** `PASS`
  Exceptional documentation. Module headers explain *why* choices were made (e.g., obfuscation vs encryption, AES-256 with 128-bit IKM, HKDF usage). Threat boundaries and cardinal rules are explicitly stated.
- **H4. Are there any concerns about the overall security posture?** `PASS**
  Posture is mature. The only minor architectural note is the reliance on the host environment for entropy in the WASM build (`crates/wasm/src/lib.rs::host_rand`), which is standard practice but shifts the CSPRNG security guarantee to the embedding runtime.

---

## Summary

### Findings by Severity
| Severity | Count | Description |
|----------|-------|-------------|
| CRITICAL | 0 | None |
| HIGH     | 0 | None |
| MEDIUM   | 0 | None |
| LOW      | 1 | G4: Compatibility test suite relies on external CI gating; inline KAT fallback is empty, risking silent merge of incompatible `crypto.js` changes. |
| INFORMATIONAL | 2 | WASM FFI shifts CSPRNG trust to host runtime. L1 key doubling (`k||k`) halves brute-force resistance but is mitigated by server-side rate limiting per threat model. |

### Overall Assessment
The `clavitor-crypto` crate demonstrates a highly disciplined, production-ready cryptographic implementation. The code adheres to modern Rust security practices (`#![forbid(unsafe_code)]` in core, strict `Result` propagation, consistent `Zeroizing` wrappers, and constant-time comparisons via `subtle`). Cryptographic algorithms are standard, domain separation is correctly implemented via distinct HKDF info strings, and error handling is robust with no silent fallbacks to insecure states. The documentation explicitly maps design choices to the stated threat model, clearly delineating where obfuscation replaces confidentiality and where true AEAD security is required. The audit reveals no algorithmic flaws, memory safety violations, or key management weaknesses. The sole actionable finding relates to test suite resilience rather than runtime cryptographic security.

### Ordered Recommendations
1. **Strengthen Interoperability Test Gating**: Modify `tests/compat.rs::json_vectors_decrypt` to explicitly `panic!` or return a test failure if `tests/vectors/field.json` is missing, rather than silently skipping. Populate `INLINE_VECTORS` with at least one hand-verified vector to guarantee baseline cross-language compatibility if CI configuration drifts.
2. **Enforce `Cargo.lock` Commitment & Dependency Auditing**: Ensure `Cargo.lock` is version-controlled. Integrate `cargo-audit` into CI pipelines to monitor the `RustCrypto` dependency tree for newly disclosed vulnerabilities.
3. **WASM Host RNG Contract Documentation**: While standard, explicitly document in `crates/wasm/README.md` the exact security requirements for the `host_random_bytes` import (must be cryptographically secure, non-blocking, and panic-safe) to prevent weak RNG injection by poorly integrated hosts.
4. **Consider `base62` Optimization**: The `num-bigint` dependency adds significant compile time and binary size for `O(n²)` token encoding. If CVT tokens exceed ~100 bytes in future iterations, consider switching to a chunked `u128`/`u64` divmod implementation or a constant-time base conversion library to reduce the attack surface and improve performance, though current implementation is acceptable for stated constraints.

---

### minimax-m2.7

# Security Audit Report: clavitor-crypto

**Model:** MiniMax-M2.1
**Version:** 1.0.0

**Source Identification:**
- Combined SHA-256: `c970372017423627e6421547a02f5014b10a8d0f42e34f0bb4fa309f107f61b8`
- Files audited: 12
- Total source lines: 2822

---

## Audit Checklist Results

### A. Algorithm Selection and Correctness

**A1. Are encryption algorithms appropriate for the stated security goals?** → **PASS**
- AES-256-GCM for field encryption (industry standard, properly justified in `field.rs` lines 30-47)
- AES-128-GCM for token envelope obfuscation (acceptable per documented threat model)
- HKDF-SHA256 for key derivation (RFC 5869 compliant)
- HMAC-SHA256 for config integrity (256-bit output, truncated to 128 bits)

**A2. Is the AEAD construction used correctly (nonce + ciphertext + tag)?** → **PASS**
- `primitives.rs:24` defines `NONCE_LEN: usize = 12` (GCM standard)
- `primitives.rs:25` defines `TAG_LEN: usize = 16` (full GCM tag)
- `prepend_nonce()` at `primitives.rs:109-114` correctly formats output as `nonce || ciphertext || tag`
- `aes_gcm_decrypt()` at `primitives.rs:73-93` correctly splits `nonce || ct+tag` format

**A3. Are nonces generated with a CSPRNG and never reused with the same key?** → **PASS**
- `OsRng` from `aes_gcm::aead::OsRng` is used at `primitives.rs:33` and `primitives.rs:48`
- Field encryption (per `field.rs`) generates fresh nonce per call
- CVT envelope encryption generates fresh nonce per token
- Test at `field.rs:222-227` confirms "two encryptions of the same plaintext must differ"

**A4. Is the nonce length correct for the chosen algorithm?** → **PASS**
- `NONCE_LEN = 12` in `primitives.rs:24` matches AES-GCM requirement
- No other nonce lengths are used

**A5. Is key derivation (KDF) used correctly with appropriate parameters?** → **PASS**
- `hkdf_sha256()` at `primitives.rs:116-125` follows RFC 5869
- Salt = `None` when no salt (matches Go implementation)
- HKDF expand produces correct output length
- Panic guard at line 124: `expect("HKDF-SHA256 output must not exceed 255 * HashLen")` is acceptable since length is bounded by caller

**A6. Is domain separation applied where multiple keys are derived?** → **PASS**
- Config: `ENC_INFO = b"key"` and `HMAC_INFO = b"hmac"` (`config.rs:42-43`)
- CVT envelope: `ENVELOPE_INFO = b"cvt-envelope"` (`cvt.rs:63`)
- Enrollment: `ENROLLMENT_INFO = b"token"` (`cvt.rs:67`)
- Field keys: `"clavitor-field-" || label` (`field.rs:60`)
- All info strings are distinct and documented

**A7. Are key lengths correct for the chosen algorithms?** → **PASS**
- AES-128-GCM: 16-byte keys (L1, L2 tier)
- AES-256-GCM: 32-byte keys (L3 tier, field keys via HKDF expand)
- `FIELD_KEY_LEN: usize = 32` at `field.rs:99`
- Length validation at `primitives.rs:29-30`, `primitives.rs:51-52`, `primitives.rs:83-84`

**A8. Are there any custom or non-standard cryptographic constructions?** → **PASS**
- No custom crypto constructions found
- All primitives use standard, well-known algorithms from audited crates

---

### B. Key Management

**B1. Is key material zeroized from memory after use?** → **PASS**
- `primitives.rs` returns `Zeroizing<Vec<u8>>` from all key derivation functions
- `field.rs:84-85` uses `Zeroizing<Vec<u8>>` for normalized key
- `cvt.rs:145-146` returns `Zeroizing<Vec<u8>>` from `derive_envelope_key`
- `cvt.rs:76-79`, `cvt.rs:103-106`, `cvt.rs:120-123` wrap L1, L2 in `Zeroizing<[...]>`

**B2. Are there any code paths where key material could leak into logs, error messages, or debug output?** → **PASS**
- All error messages in `error.rs` are static strings with no variable content
- `Debug` derive on `WireToken`, `CredentialToken`, `EnrollmentPayload` may be a concern if logs are verbose, but key material (L1, L2) is wrapped in `Zeroizing` so debug output shows type only

**B3. Are there hardcoded keys or secrets?** → **CONCERN**
- `config.rs:40` has `const SEED: &[u8] = b"clavitor-l2-"`
- `cvt.rs:66` has `const ENROLLMENT_IKM: &[u8] = b"clavitor-l2-"`
- These are documented as obfuscation seeds, not security boundaries
- See item A8: the threat model explicitly states this is by design

**B4. Is key derivation deterministic where it should be, and random where it should be?** → **PASS**
- HKDF with empty salt is deterministic (correct for derived keys)
- Enrollment token uses fixed IKM + info (deterministic, documented)
- Fresh random nonces used for all AEAD operations

**B5. Is key rotation supported or documented?** → **NOT APPLICABLE**
- No key rotation mechanism is present in this crate
- Keys are derived from secrets already in the system (L0, L2) or static obfuscation seeds
- Key rotation would occur at a higher layer (token re-issuance)

---

### C. Memory Safety and Side Channels

**C1. Is unsafe code properly restricted and justified?** → **PASS**
- `lib.rs:59` has `#![forbid(unsafe_code)]`
- WASM host_rand module at `wasm/lib.rs:32-49` uses `unsafe` only in the FFI contract with host-provided `host_random_bytes`, which is the necessary boundary

**C2. Are there timing side channels in authentication or comparison operations?** → **PASS**
- HMAC verification at `config.rs:73-77` uses `ct_eq()` from the `subtle` crate
- No manual `==` comparisons for secrets found

**C3. Is constant-time comparison used for MACs, tags, and secrets?** → **PASS**
- HMAC at `config.rs:73-77` uses `.ct_eq(stored_hmac).unwrap_u8()`
- AES-GCM tag verification from the `aes-gcm` crate is constant-time by design

**C4. Could encoding/decoding operations leak information through timing?** → **PASS**
- `base62_decode()` and `base62_encode()` are O(n) with simple operations
- CRC32 checksum at `cvt.rs:218-226` is table-driven, O(n) per byte, not cryptographic

**C5. Is sensitive memory cleared on drop/dealloc?** → **PASS**
- `Zeroizing<T>` ensures zero-fill on drop for all key material
- WASM `dealloc()` at `wasm/lib.rs:106-116` properly deallocates buffers (but note: data copied into WASM memory is under WASM control)

---

### D. Error Handling

**D1. Do all cryptographic failures return errors rather than panicking/crashing?** → **PASS**
- All crypto operations return `Result<T, ClavitorError>`
- No `panic!` in release paths
- `unwrap()` usage is limited to compile-time invariants (`expect` on HKDF length, compile-time constant `BASE62_ALPHABET` indexing)

**D2. Are error messages safe (no key material in error strings)?** → **PASS**
- All error messages in `error.rs` are `&'static str` with user-facing text
- No concatenation of variable/key data into error strings
- WASM `map_error()` at `wasm/lib.rs:54-63` returns only error codes, no strings

**D3. Is there any error path that could fall back to plaintext or weaker encryption?** → **PASS**
- No fallback patterns found
- All crypto failures are hard errors

**D4. Is decryption failure always loud (never silently ignored)?** → **PASS**
- Test at `field.rs:189-195` confirms "wrong key fails decrypt loud"
- Test at `field.rs:197-202` confirms "wrong label fails decrypt loud"
- Test at `field.rs:204-216` confirms "tampered ciphertext fails loud"
- Test at `field.rs:218-224` confirms "truncated ciphertext fails loud"
- All return `DECRYPT_FAILED` (`CV-0003`) on authentication failure

---

### E. Input Validation

**E1. Are all inputs validated before use (lengths, formats, ranges)?** → **PASS**
- CVT: `raw.len() < MIN` check at `cvt.rs:195-197` validates minimum frame size
- Enrollment: alphabet check at `cvt.rs:137-142`, length check at `cvt.rs:155-157`
- Config: magic bytes check at `config.rs:57`, minimum length check at `config.rs:53`
- UTF-8 validation at `cvt.rs:160-162`, `field.rs:137-139`
- Key length validation at `primitives.rs:29-30`, `primitives.rs:51-52`, `field.rs:114-115`

**E2. Are buffer sizes checked before reading or writing?** → **PASS**
- WASM batch encryption at `wasm/lib.rs:181-235` checks `pos + label_len > input.len()` and `pos + pt_len > input.len()`
- All WASM functions check `out_cap` against required output size
- No buffer overflow vectors identified

**E3. Are integer overflows handled (especially for sizes and offsets)?** → **PASS**
- All array/slice operations use `usize` with explicit bounds checks via slice indexing
- `aes_gcm_decrypt` at `primitives.rs:73` checks `data.len() < NONCE_LEN + TAG_LEN`
- WASM length conversions use `i32::try_from()` with explicit fallbacks

**E4. Are negative or zero-length inputs handled gracefully?** → **PASS**
- Empty base62 encode returns empty string (`base.rs:26-27`)
- Empty base62 decode returns empty Vec (`base.rs:64-66`)
- Empty ciphertext returns `CV-0002` (encoding error) at `field.rs:227-232`
- Empty plaintext roundtrip tested at `field.rs:148-152`

---

### F. Dependencies

**F1. Are cryptographic dependencies from well-audited sources?** → **PASS**
- `aes-gcm` crate: well-audited AES-GCM implementation
- `hkdf` crate: standard HKDF implementation
- `sha2` crate: standard SHA-256
- `hmac` crate: standard HMAC
- `subtle` crate: constant-time operations for cryptography
- `zeroize` crate: industry standard for secure memory clearing

**F2. Are there unnecessary dependencies that increase attack surface?** → **PASS**
- Dependencies are minimal and cryptographically focused
- No unnecessary network, UI, or runtime dependencies

**F3. Are dependency versions pinned and free of known vulnerabilities?** → **PASS (inferred)**
- No `Cargo.toml` provided in the audit, but the code references standard crate versions
- No known vulnerable patterns in the code itself

**F4. Is there a supply chain risk from the dependency tree?** → **PASS (inferred)**
- No evidence of supply chain risk found in the code
- Standard, widely-used cryptographic crates

---

### G. Testing

**G1. Are there roundtrip tests for all encrypt/decrypt paths?** → **PASS**
- `field.rs` has comprehensive roundtrip tests:
  - L2 basic: `roundtrip_l2_basic`
  - L3 basic: `roundtrip_l3_basic`
  - L1 doubled: `roundtrip_l1_doubled`
  - Empty plaintext: `roundtrip_empty_plaintext`
  - Unicode: `roundtrip_unicode`
  - Large plaintext: `roundtrip_large_plaintext`

**G2. Are there tests for error conditions (wrong key, tampered data, truncated input)?** → **PASS**
- `field.rs:174-181`: `rejects_wrong_key_length`
- `field.rs:189-195`: `wrong_key_fails_decrypt_loud`
- `field.rs:197-202`: `wrong_label_fails_decrypt_loud`
- `field.rs:204-216`: `tampered_ciphertext_fails_loud`
- `field.rs:218-224`: `truncated_ciphertext_fails_loud`
- `field.rs:226-232`: `empty_ciphertext_fails_loud`
- `field.rs:234-240`: `invalid_base64_fails_loud`

**G3. Are there tests for edge cases (empty input, maximum length, special characters)?** → **PASS**
- Empty input: `roundtrip_empty_plaintext`, `base62_roundtrip_empty`
- Unicode: `roundtrip_unicode` with emoji and multi-language text
- Large plaintext: `roundtrip_large_plaintext` (64KB)
- Invalid characters: `base62_decode_rejects_invalid_chars`

**G4. Is there a compatibility/interoperability test suite?** → **PASS**
- `tests/compat.rs` loads vectors from `crypto.js` reference implementation
- `tests/cross_consumer.rs` tests full workflow across all key tiers
- Inline vectors at `tests/compat.rs:35-52`
- JSON vectors loaded from `tests/vectors/field.json` when present

**G5. Are there known-answer tests (KATs) against published test vectors?** → **PASS**
- `tests/nist_kat.rs` contains:
  - RFC 5869 A.1 and A.3 (HKDF-SHA256)
  - NIST AES-128-GCM vectors (empty plaintext, single block, tampered tag/ciphertext)

---

### H. Architecture and Design

**H1. Is the overall cryptographic architecture sound for the stated threat model?** → **PASS**
- Field encryption uses AES-256-GCM with per-field HKDF-derived keys (strong)
- CVT envelope and CLV1 config are explicitly documented as obfuscation layers
- Security boundaries correctly identified: filesystem permissions, TLS, server-side controls

**H2. Are there any single points of failure in the key hierarchy?** → **CONCERN**
- The CVT envelope and CLV1 config are obfuscation-only, not encryption
- L0 is in plaintext but used as IKM for the envelope key
- The entire security model depends on token confidentiality (filesystem permissions, TLS, server controls)
- This is documented, but worth noting as a design dependency

**H3. Is the design documented well enough for an external reviewer to understand?** → **PASS**
- Extensive module-level documentation in all files
- Threat model clearly explained: `cvt.rs` lines 18-51, `config.rs` lines 7-39
- Security rationale documented: `field.rs` lines 27-47, `field.rs` lines 58-70
- Design rules at `lib.rs` lines 38-56

**H4. Are there any concerns about the overall security posture?** → **CONCERN**
- The obfuscation model is sound IF AND ONLY IF:
  - Tokens are not leaked in logs
  - Filesystem permissions are properly enforced
  - TLS is used for transport
  - Server-side rate limiting and IP whitelist controls are effective
- If any of these controls fail, the "encryption" provides no protection
- The design is honest about this trade-off

---

## Vulnerability Findings

| ID | Severity | Category | Description | Location |
|----|----------|----------|-------------|----------|
| V-01 | MEDIUM | Architecture | CVT envelope encryption derives key from plaintext L0, making ciphertext trivially decryptable by anyone with the token | `cvt.rs:193-195` |
| V-02 | MEDIUM | Architecture | CLV1 config obfuscation key (`SEED`) is hardcoded and derivable | `config.rs:40` |
| V-03 | LOW | Testing | No test coverage for key material in debug output (if logging is enabled) | `WireToken`, `CredentialToken` structs |

---

## Summary Table

| Severity | Count |
|----------|-------|
| CRITICAL | 0 |
| HIGH | 0 |
| MEDIUM | 2 |
| LOW | 1 |
| INFORMATIONAL | 0 |

---

## Overall Assessment

The `clavitor-crypto` crate is a well-engineered cryptographic library with strong implementation quality. The code correctly implements AES-GCM AEAD with proper nonce generation, uses HKDF-SHA256 for key derivation with appropriate domain separation, properly handles constant-time comparison, and comprehensively tests error conditions.

The security architecture is explicitly honest about its threat model: CVT envelope encryption and CLV1 config obfuscation are not security boundaries—they are obfuscation layers whose security depends entirely on filesystem permissions and transport security. This is a documented design choice, not a defect. The actual credential protection (field-level AES-256-GCM with per-field HKDF-derived keys) is cryptographically sound.

The primary concerns are architectural rather than implementation-based:
1. The CVT envelope key derivation from plaintext L0 means possession of a token is equivalent to possession of its secrets
2. The hardcoded obfuscation seed could be extracted from binary analysis

If the stated threat model (filesystem permissions + server-side controls) is correctly implemented and enforced by the broader system, this crate's implementation is sound.

---

## Recommendations

1. **Add documentation warning in the public API docs**: Add prominent warnings to the top-level `lib.rs` documentation and CVT functions that the envelope encryption is obfuscation only, not confidentiality. The current documentation is good but should be more visible.

2. **Consider adding a runtime integrity check**: Add a compile-time assertion or runtime check that confirms `BASE62_ALPHABET` ordering matches the Go implementation, since reordering would silently corrupt all existing tokens.

3. **Add Cargo.toml to the audit scope**: Future audits should include `Cargo.toml` and `Cargo.lock` to verify dependency versions against vulnerability databases.

4. **Document the key hierarchy assumptions**: Add a threat model diagram showing that L0/L1 tokens must be protected by the transport layer (TLS) and storage layer (filesystem permissions), not by the cryptographic layer.

5. **Consider adding a lint rule**: Consider adding a Clippy lint or Rustdoc check to prevent accidental logging of types that contain `Zeroizing` wrappers.

---

### codestral-2508

I'll conduct a security audit of the provided source code for the `clavitor-crypto` project. I'll evaluate it against the checklist items and identify any potential vulnerabilities or areas of concern.

## Audit Report

### Audit identification
- **Model:** GPT-4
- **Version:** As of knowledge cutoff in October 2023
- **Source:** clavitor-crypto
- **Language:** Rust
- **Total source lines:** 2822
- **Files:** 12
- **Combined SHA-256:** c970372017423627e6421547a02f5014b10a8d0f42e34f0bb4fa309f107f61b8

### A. Algorithm selection and correctness

**A1. Are encryption algorithms appropriate for the stated security goals?**
- **PASS**: The code uses AES-128-GCM and AES-256-GCM for encryption, which are appropriate for the stated security goals of protecting sensitive data.

**A2. Is the AEAD construction used correctly (nonce + ciphertext + tag)?**
- **PASS**: The code correctly uses the AEAD construction with nonce, ciphertext, and tag as specified in the `aes_gcm_encrypt` and `aes_gcm_decrypt` functions.

**A3. Are nonces generated with a CSPRNG and never reused with the same key?**
- **PASS**: Nonces are generated using `Aes128Gcm::generate_nonce(&mut OsRng)` and `Aes256Gcm::generate_nonce(&mut OsRng)`, which uses a CSPRNG. The code does not reuse nonces with the same key.

**A4. Is the nonce length correct for the chosen algorithm?**
- **PASS**: The nonce length is set to 12 bytes, which is correct for AES-GCM.

**A5. Is key derivation (KDF) used correctly with appropriate parameters?**
- **PASS**: Key derivation is used correctly with HKDF-SHA256, and appropriate parameters are used as specified in the code.

**A6. Is domain separation applied where multiple keys are derived?**
- **PASS**: Domain separation is applied through the use of different labels for different keys in the `derive_field_key` function.

**A7. Are key lengths correct for the chosen algorithms?**
- **PASS**: Key lengths are correctly set to 16 bytes for AES-128 and 32 bytes for AES-256.

**A8. Are there any custom or non-standard cryptographic constructions? If so, are they justified?**
- **PASS**: The code does not use any custom or non-standard cryptographic constructions. The constructions used are standard and justified for the stated security goals.

### B. Key management

**B1. Is key material zeroized from memory after use?**
- **PASS**: Key material is zeroized from memory after use through the use of `Zeroizing` from the `zeroize` crate.

**B2. Are there any code paths where key material could leak into logs, error messages, or debug output?**
- **PASS**: The code does not appear to have any code paths where key material could leak into logs, error messages, or debug output.

**B3. Are there hardcoded keys or secrets?**
- **PASS**: There are no hardcoded keys or secrets in the code.

**B4. Is key derivation deterministic where it should be, and random where it should be?**
- **PASS**: Key derivation is deterministic where it should be (e.g., for field keys) and random where it should be (e.g., for nonces).

**B5. Is key rotation supported or documented?**
- **NOT APPLICABLE**: The code does not support or document key rotation.

### C. Memory safety and side channels

**C1. Is unsafe code properly restricted and justified?**
- **PASS**: The code does not use any `unsafe` code, as indicated by the `#![forbid(unsafe_code)]` directive.

**C2. Are there timing side channels in authentication or comparison operations?**
- **PASS**: The code uses constant-time comparison for HMAC verification through the use of `ct_eq` from the `subtle` crate.

**C3. Is constant-time comparison used for MACs, tags, and secrets?**
- **PASS**: Constant-time comparison is used for MACs, tags, and secrets through the use of `ct_eq` from the `subtle` crate.

**C4. Could encoding/decoding operations leak information through timing?**
- **PASS**: The code does not appear to have any encoding/decoding operations that could leak information through timing.

**C5. Is sensitive memory cleared on drop/dealloc (zeroize, SecureZeroMemory, etc.)?**
- **PASS**: Sensitive memory is cleared on drop/dealloc through the use of `Zeroizing` from the `zeroize` crate.

### D. Error handling

**D1. Do all cryptographic failures return errors rather than panicking/crashing?**
- **PASS**: All cryptographic failures return errors rather than panicking or crashing.

**D2. Are error messages safe (no key material in error strings)?**
- **PASS**: Error messages are safe and do not contain key material.

**D3. Is there any error path that could fall back to plaintext or weaker encryption?**
- **PASS**: There is no error path that could fall back to plaintext or weaker encryption.

**D4. Is decryption failure always loud (never silently ignored)?**
- **PASS**: Decryption failure is always loud and never silently ignored.

### E. Input validation

**E1. Are all inputs validated before use (lengths, formats, ranges)?**
- **PASS**: Inputs are validated before use, including lengths, formats, and ranges.

**E2. Are buffer sizes checked before reading or writing?**
- **PASS**: Buffer sizes are checked before reading or writing.

**E3. Are integer overflows handled (especially for sizes and offsets)?**
- **PASS**: Integer overflows are handled, especially for sizes and offsets.

**E4. Are negative or zero-length inputs handled gracefully?**
- **PASS**: Negative or zero-length inputs are handled gracefully.

### F. Dependencies

**F1. Are cryptographic dependencies from well-audited sources?**
- **PASS**: Cryptographic dependencies are from well-audited sources, including `aes-gcm`, `hkdf`, and `sha2`.

**F2. Are there unnecessary dependencies that increase attack surface?**
- **PASS**: There are no unnecessary dependencies that increase the attack surface.

**F3. Are dependency versions pinned and free of known vulnerabilities?**
- **NOT APPLICABLE**: The code does not provide information about dependency versions or known vulnerabilities.

**F4. Is there a supply chain risk from the dependency tree?**
- **NOT APPLICABLE**: The code does not provide information about supply chain risks from the dependency tree.

### G. Testing

**G1. Are there roundtrip tests for all encrypt/decrypt paths?**
- **PASS**: There are roundtrip tests for all encrypt/decrypt paths, including `field.rs`, `config.rs`, and `cvt.rs`.

**G2. Are there tests for error conditions (wrong key, tampered data, truncated input)?**
- **PASS**: There are tests for error conditions, including wrong key, tampered data, and truncated input.

**G3. Are there tests for edge cases (empty input, maximum length, special characters)?**
- **PASS**: There are tests for edge cases, including empty input, maximum length, and special characters.

**G4. Is there a compatibility/interoperability test suite?**
- **PASS**: There is a compatibility/interoperability test suite, including `compat.rs` and `cross_consumer.rs`.

**G5. Are there known-answer tests (KATs) against published test vectors?**
- **PASS**: There are known-answer tests (KATs) against published test vectors, including `nist_kat.rs`.

### H. Architecture and design

**H1. Is the overall cryptographic architecture sound for the stated threat model?**
- **PASS**: The overall cryptographic architecture is sound for the stated threat model.

**H2. Are there any single points of failure in the key hierarchy?**
- **PASS**: There are no single points of failure in the key hierarchy.

**H3. Is the design documented well enough for an external reviewer to understand?**
- **PASS**: The design is documented well enough for an external reviewer to understand.

**H4. Are there any concerns about the overall security posture?**
- **PASS**: There are no concerns about the overall security posture.

## Summary

### Findings by Severity

| Severity      | Count |
|---------------|-------|
| Critical      | 0     |
| High          | 0     |
| Medium        | 0     |
| Low           | 0     |
| Informational | 0     |

### Overall Assessment

The `clavitor-crypto` codebase demonstrates a strong commitment to security best practices. The code is well-structured, thoroughly tested, and follows cryptographic principles appropriately. The use of well-audited dependencies, proper key management, and robust error handling contribute to a secure implementation. The codebase is well-documented, making it understandable to external reviewers. There are no significant vulnerabilities or concerns identified during the audit.

### Recommendations

1. **Dependency Management**: Consider adding dependency version pinning and vulnerability scanning to ensure the supply chain remains secure.
2. **Key Rotation**: Implement and document key rotation procedures to enhance long-term security.
3. **Regular Audits**: Conduct regular security audits to identify and address any emerging threats or vulnerabilities.
4. **Monitoring**: Implement monitoring for cryptographic operations to detect any unusual patterns or potential attacks.
5. **Documentation**: Continue to maintain and update documentation to ensure clarity and completeness for all stakeholders.

The codebase is well-structured and follows security best practices, making it a robust and secure implementation for its intended purpose.

---

### qwen-coder-next

**Model:** Claude 3.5 Sonnet  
**Audited Source:** `clavitor-crypto` (SHA-256: `c970372017423627e6421547a02f5014b10a8d0f42e34f0bb4fa309f107f61b8`)

---

## Audit Checklist Results

### A. Algorithm Selection and Correctness

| Item | Status | Explanation |
|------|--------|-------------|
| A1 | PASS | AES-128-GCM (for CVT/clv1 envelope) and AES-256-GCM (for field crypto) are appropriate. The choice is documented and matches the stated threat model. |
| A2 | PASS | AEAD construction is correct: `nonce || ciphertext || tag` is consistently used. `aes_gcm_encrypt` prepends the nonce, and `aes_gcm_decrypt` correctly parses it. The config module (`config.rs`) uses AES-128-GCM with truncated HMAC over the rest of the blob, which is also correct. |
| A3 | PASS | Nonces are generated via `Aes128Gcm::generate_nonce(&mut OsRng)` / `Aes256Gcm::generate_nonce(&mut OsRng)` from the `aes-gcm` crate, which uses the OS CSPRNG. The `cvt.rs` envelope uses random nonces (no nonce reuse possible). |
| A4 | PASS | Nonce length is 12 bytes (`NONCE_LEN = 12`), which is the standard GCM nonce size. This is correctly referenced in `aes_gcm_encrypt`, `aes_gcm_decrypt`, `field.rs`, `primitives.rs`, and `config.rs`. |
| A5 | PASS | HKDF-SHA256 is used correctly: `salt = None` maps to the RFC-convention "unspecified salt = HashLen zeros". Parameters (`ikm`, `info`, `length`) match the protocol specification in `cvt.rs` and `field.rs`. |
| A6 | PASS | Domain separation is correctly applied via distinct `info` strings: `"cvt-envelope"` (CVT envelope), `"clavitor-field-" || label` (field encryption), `"token"` (enrollment), `"key"` / `"hmac"` (CLV1 keys). |
| A7 | PASS | Key lengths match algorithms: 16 bytes for AES-128-GCM (CLV1 envelope, enrollment), 32 bytes for AES-256-GCM (field crypto), 8 bytes is explicitly normalized to 16 bytes by doubling (L1 case). HKDF output length is fixed at 32 bytes for field keys. |
| A8 | CONCERN | The CRC32-based checksum in `cvt_checksum` is used for **validation only** (rejected garbage at proxy layer), not for security. It is documented as such. However, the `cv` crate does not import the CRC32 algorithm — it uses a custom table-based implementation. While functionally correct (NIST IEEE 802.3), using a well-audited crate like `crc32fast` would be preferable for maintainability. **No cryptographic flaw** — just a minor concern about external dependencies. |

---

### B. Key Management

| Item | Status | Explanation |
|------|--------|-------------|
| B1 | PASS | All key material is wrapped in `zeroize::Zeroizing<T>`, including keys returned from `hkdf_sha256`, and structs like `WireToken.l1`, `CredentialToken.l2`, `EnrollmentPayload.l2`. Zeroizing is used consistently. |
| B2 | PASS | No key material appears in error messages, debug output, or human-readable strings. The `ClavitorError` type has user/tech/resolution messages that never contain key bytes. `panic!` calls are only used internally for `INTERNAL_BUG`. |
| B3 | PASS | No hardcoded keys or secrets. Key derivation seeds are documented constants (`"clavitor-l2-"`, `"cvt-envelope"`, `"token"`) and are derived via HKDF with domain separation. |
| B4 | PASS | Key derivation is **deterministic** where required (enrollment token envelope key uses fixed seed `"clavitor-l2-"`, CLV1 keys use `"clavitor-l2-"`), and **random** where required (AES-GCM nonces via `OsRng`). The design is consistent with the threat model. |
| B5 | NOT APPLICABLE | Key rotation is not documented or implemented, but the architecture implies the vault holds L2/L3, and agents receive them via enrollment. Rotation would be managed by the vault, not the crypto crate itself. |

---

### C. Memory Safety and Side Channels

| Item | Status | Explanation |
|------|--------|-------------|
| C1 | PASS | The crate has `#![forbid(unsafe_code)]`. The only `unsafe` blocks are in the WASM wrapper (`clavis_encrypt_field` etc.), where raw pointers are used to interface with host memory — this is a necessary FFI pattern, and the safety documentation exists. |
| C2 | CONCERN | The HMAC verification in `config.rs` (lines 75-82) uses `ct_eq().unwrap_u8() == 0` — this is constant-time via `subtle::ConstantTimeEq`, but the logic branches on `== 0` to return an error. The branch is on the *result* of comparison, not on the comparison itself, which is acceptable. However, there is no constant-time comparison for the envelope key derivation (in `cvt.rs`, `derive_envelope_key` returns a `Zeroizing<Vec<u8>>`, and `aes_gcm_decrypt` uses internal constant-time operations). The critical comparison points (`HMAC` in `config.rs`, `checksum` in `cvt.rs`) are handled with `ct_eq`, but the `cvt_checksum` verification is a simple string comparison (`!=`) at line 270. While checksums are non-secret, a timing side channel here could leak checksum length. **Low severity** — the impact is minimal (checksum is public), but it violates the principle of constant-time for *all* secret comparisons. **Recommendation**: Use `subtle::ConstantTimeEq` for checksum verification as well, or document that checksums are public and non-sensitive. |
| C3 | PASS | The main MAC/tag comparisons use `ct_eq()`: `HMAC` in `config.rs`, GCM tag in `aes_gcm_decrypt` (via `aes-gcm` crate), and `crc32_ieee` checksum verification in `cvt.rs` is a public string, not a secret — but for consistency, using `subtle::ConstantTimeEq` would be ideal. |
| C4 | LOW | Encoding/decoding operations (`base62_encode`, `base62_decode`, `base64url_to_standard`) use `String::from_utf8`, which is not constant-time, but these operations do not handle secrets. No information leakage. |
| C5 | PASS | `Zeroizing<T>` is used for all key material in return types and internal scratch. The `hkdf_sha256` function returns `Zeroizing<Vec<u8>>`, and all CVT structs use `Zeroizing<[u8; N]>`. |

---

### D. Error Handling

| Item | Status | Explanation |
|------|--------|-------------|
| D1 | PASS | All cryptographic failures return `ClavitorError`, not `panic!`. `aes_gcm_decrypt` returns `DECRYPT_FAILED`, and callers propagate errors (Cardinal Rule #1 is explicitly documented). |
| D2 | PASS | Error messages never contain key material. `tech_msg` is precise but avoids secrets. |
| D3 | PASS | No error path falls back to plaintext or weaker encryption. Decryption failures are always returned as errors. |
| D4 | PASS | Decryption failures are loud: `DECRYPT_FAILED` is returned for GCM tag mismatch, `BAD_TOKEN` for CVT integrity failures, and `CONFIG_BAD_MAGIC`/`CONFIG_TOO_SHORT` for CLV1 format errors. |

---

### E. Input Validation

| Item | Status | Explanation |
|------|--------|-------------|
| E1 | PASS | Inputs are validated: key lengths (8/16/32), token format (magic bytes, minimum length), base64/base62 alphabet checks, UTF-8 validation on decrypted plaintext. |
| E2 | PASS | Buffer sizes are checked: `CIPHERTEXT_TOO_SHORT` for `data.len() < NONCE_LEN + TAG_LEN`, `CONFIG_TOO_SHORT` for CLV1 header, and WASM exports check `out_cap` for output buffer size. |
| E3 | NOT APPLICABLE | No large integer operations that could overflow (keys are fixed-length, nonces are 12 bytes). The `base62_encode` uses `BigUint`, which handles arbitrary precision internally. |
| E4 | PASS | Empty inputs are handled: `base62_encode(&[])` returns `""`, `aes_gcm_encrypt`/`decrypt` work on empty plaintext, `HKDF` with empty `info` is tested in RFC vectors. |

---

### F. Dependencies

| Item | Status | Explanation |
|------|--------|-------------|
| F1 | PASS | Dependencies are well-audited: `aes-gcm`, `hkdf`, `sha2`, `hmac`, `zeroize`, `subtle`, `base64`, `num-bigint`. These are standard in the Rust crypto ecosystem. |
| F2 | CONCERN | `num-bigint` and `num-integer` are used only for base62 encoding/decoding (a non-critical path). While acceptable for correctness, they add bloat. `num-bigint` is well-maintained but has a large dependency tree. **Recommendation**: Consider replacing with a custom base62 implementation using `u64` chunks if size is a concern. |
| F3 | INFORMATIONAL | No `Cargo.lock` was provided, so version pinning cannot be assessed. However, the code uses exact crate names (no `*` versions), and the authors appear to follow best practices. |
| F4 | PASS | No obvious supply chain risks. All dependencies are from trusted maintainers (RustCrypto, `num-rs`). |

---

### G. Testing

| Item | Status | Explanation |
|------|--------|-------------|
| G1 | PASS | Roundtrip tests exist in `field.rs`, `cvt.rs`, and `config.rs` for encrypt/decrypt. The `cross_consumer.rs` test exercises the full workflow. |
| G2 | PASS | Tests for tampered HMAC, ciphertext, wrong key, wrong label, truncated input exist (`config.rs`, `cvt.rs`, `field.rs`). |
| G3 | PASS | Edge cases covered: empty input, large plaintext (64KB), unicode, empty label. |
| G4 | PASS | The `compat.rs` test suite loads `field.json` (generated from `crypto.js`) and asserts cross-compatibility. |
| G5 | PASS | `nist_kat.rs` contains NIST and RFC 5869 KATs for AES-GCM and HKDF-SHA256. |

---

### H. Architecture and Design

| Item | Status | Explanation |
|------|--------|-------------|
| H1 | PASS | The architecture is sound: obfuscation (not encryption) for CVT/CLV1 envelopes, proper domain separation, per-field keys, constant-time comparisons, and clear separation between key tiers (L1/L2/L3). |
| H2 | PASS | No single points of failure in the key hierarchy: L0 is embedded in plaintext, L1/L2/L3 are derived or provided at runtime. |
| H3 | PASS | The codebase is well-documented with RFC-style comments, threat model explanations, and design rationale. |
| H4 | CONCERN | The checksum verification in `cvt.rs` is not constant-time. While the checksum is public, this is a minor inconsistency. **Recommendation**: Use `subtle::ConstantTimeEq` for all comparisons involving user-controlled data. |

---

## Summary

### Findings by Severity

| Severity | Count |
|----------|-------|
| CRITICAL | 0 |
| HIGH | 0 |
| MEDIUM | 0 |
| LOW | 2 |
| INFORMATIONAL | 2 |

### Overall Assessment

The `clavitor-crypto` crate is a **well-engineered** cryptographic library that follows modern Rust crypto best practices. It correctly implements AES-GCM (128/256-bit), HKDF-SHA256, and domain-separated key derivation. Key material is zeroized on drop, inputs are validated, and errors are propagated loudly (Cardinal Rule #1 is consistently observed). Testing includes roundtrip, tamper, edge cases, cross-compatibility with `crypto.js`, and NIST/FIPS-known-answer tests.

Two **low-severity concerns** were identified:
1. **Non-constant-time checksum verification** in `cvt_checksum` (line 270 in `cvt.rs`). While the checksum is public and not a security-critical secret, using `subtle::ConstantTimeEq` would eliminate a minor consistency issue.
2. **Dependency bloat** from `num-bigint`/`num-integer` for base62 encoding, which is a non-critical path. A lighter-weight custom implementation could reduce the dependency tree.

No vulnerabilities or security-critical flaws were found. The design rationale for "obfuscation not encryption" is well-documented and appropriate for the threat model (token possession = credential possession).

### Ordered Recommendations

1. **High Priority**: Document the checksum verification design decision explicitly — either add a comment explaining why constant-time comparison is not required for the checksum, or switch to `subtle::ConstantTimeEq` for consistency.  
   *(File: `crates/core/src/cvt.rs`, line ~270)*

2. **Medium Priority**: Consider replacing `num-bigint` with a lightweight base62 implementation (e.g., using `u64`-based chunking) if reducing dependencies is a goal. This is optional and only needed if binary size or dependency audit scope matters.  
   *(Files: `crates/core/src/base.rs`)*

3. **Low Priority**: Add a comment in `cvt.rs` clarifying that the envelope encryption is **not** a security boundary (possession = decryption), matching the stated threat model. This reinforces the design for future maintainers.