Managed Databases
How Phantom extends data sovereignty to managed database services — the hardest problem in cloud sovereignty, and what we can realistically do about it.
The Problem
Managed databases are the biggest gap in cloud data sovereignty
Phantom protects secrets in transit and in process memory. But application data — customer records, financial transactions, health data, PII — lives in managed databases that the cloud provider fully controls. A CLOUD Act subpoena targeting Cloud SQL or RDS gets everything.
When you use a managed database (Cloud SQL, RDS, Aurora, AlloyDB, Spanner, DynamoDB, Azure SQL), the cloud provider controls every layer:
| Layer | Who Controls It | CLOUD Act Accessible |
|---|---|---|
| Storage layer | Cloud provider | Yes — they own the disks |
| Encryption at rest | Cloud provider (even CMEK) | Yes — they manage the KMS |
| Backups & snapshots | Cloud provider | Yes — they create and store them |
| Database engine | Cloud provider | Yes — they can add logging, dump queries |
| Network layer | Cloud provider | Yes — they can intercept traffic |
| Replication | Cloud provider | Yes — they control replica placement |
Why CMEK Doesn't Help
Customer-Managed Encryption Keys (CMEK) sound sovereign, but they aren't. The key lives in the provider's KMS (Cloud KMS, AWS KMS, Azure Key Vault). The provider's infrastructure performs the encryption/decryption. Under a CLOUD Act order, the provider can be compelled to use the key — they have the technical capability, which is all that matters legally.
The uncomfortable truth
There is no way to use a fully managed database and have complete data sovereignty. The "managed" part means the provider has access. Every approach below involves trade-offs — the question is which trade-offs are acceptable for your threat model.
Phantom's Approach: DB Proxy Sidecar
The strongest protection Phantom can offer for managed databases is application-level encryption via an injected database proxy sidecar — architecturally identical to the existing secret injection pattern:
┌─────────────────────────────────────────────────────────┐
│ Pod (injected by Phantom mutating webhook)              │
│                                                         │
│ ┌───────────────┐    ┌────────────────────────────────┐ │
│ │ App Container │    │ Phantom DB Proxy Sidecar       │ │
│ │               │───▶│ • Encrypts on write            │ │
│ │ connects to   │    │ • Decrypts on read             │ │
│ │ localhost:    │    │ • Keys from EU OpenBao         │ │
│ │ 5432/3306     │    │ • Field-level granularity      │ │
│ └───────────────┘    └──────────────┬─────────────────┘ │
│                                     │                   │
│ ┌─────────────────────────┐         │                   │
│ │ Phantom Secret Sidecar  │         │                   │
│ │ (existing)              │         │                   │
│ └─────────────────────────┘         │                   │
└─────────────────────────────────────┼───────────────────┘
                                      │ TLS
                                      ▼
                          ┌───────────────────────┐
                          │ Managed Database      │
                          │ (Cloud SQL / RDS)     │
                          │                       │
                          │ Stores CIPHERTEXT     │
                          │ Provider sees blobs   │
                          └───────────────────────┘

                          ┌───────────────────────┐
                          │ EU-Hosted OpenBao     │
                          │ (Encryption Keys)     │
                          │                       │
                          │ Outside CLOUD Act     │
                          │ jurisdiction          │
                          └───────────────────────┘
How It Works
- Same webhook: the existing Phantom mutating admission webhook injects the DB proxy sidecar alongside the secret sidecar.
- Transparent to the app: the application connects to localhost:5432 (Postgres) or localhost:3306 (MySQL). The proxy intercepts the connection.
- Field-level encryption: configured per-table, per-column. Only sensitive fields are encrypted. Non-sensitive columns remain queryable as normal.
- Keys from EU OpenBao: encryption keys are fetched from the same EU-hosted OpenBao instance. Keys never enter cloud provider infrastructure.
- Managed DB stores ciphertext: the cloud provider (and any CLOUD Act subpoena) gets encrypted blobs for protected fields.
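The write and read paths described above can be sketched in a few lines. This is only an illustration of the field-level routing idea, not Phantom's implementation: the `POLICY` map, the `enc:` prefix, and the function names are assumptions made for the example, and the codec is a placeholder that a real proxy would replace with authenticated encryption using keys fetched from OpenBao.

```python
import base64
from typing import Callable, Dict, Set

# Hypothetical policy: table name -> set of protected column names.
POLICY: Dict[str, Set[str]] = {"customers": {"email", "tax_id"}}

Codec = Callable[[bytes], bytes]


def protect_row(table: str, row: Dict[str, str], encrypt: Codec) -> Dict[str, str]:
    """Write path: transform only the columns the policy lists for this table."""
    protected = POLICY.get(table, set())
    out = {}
    for column, value in row.items():
        if column in protected:
            blob = base64.b64encode(encrypt(value.encode())).decode()
            out[column] = "enc:" + blob
        else:
            out[column] = value  # non-sensitive columns pass through untouched
    return out


def reveal_row(table: str, row: Dict[str, str], decrypt: Codec) -> Dict[str, str]:
    """Read path: reverse the transformation for protected columns."""
    protected = POLICY.get(table, set())
    out = {}
    for column, value in row.items():
        if column in protected and value.startswith("enc:"):
            out[column] = decrypt(base64.b64decode(value[4:])).decode()
        else:
            out[column] = value
    return out


# Placeholder codec: byte reversal is NOT encryption. It only stands in
# for a real cipher so the routing logic can be run with the stdlib.
def toy_encrypt(data: bytes) -> bytes:
    return data[::-1]


def toy_decrypt(data: bytes) -> bytes:
    return data[::-1]


row = {"id": "42", "email": "a@example.com", "plan_tier": "pro"}
stored = protect_row("customers", row, toy_encrypt)   # what the managed DB receives
restored = reveal_row("customers", stored, toy_decrypt)  # what the app reads back
```

The point of the sketch is that the application sees the same row on both sides of the proxy, while the database only ever receives the transformed values for columns named in the policy.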
Configuration Example
# Phantom DB proxy annotation on a Deployment
metadata:
  annotations:
    phantom.cloudcondom.io/db-proxy: "enabled"
    phantom.cloudcondom.io/db-target: "cloud-sql-instance:5432"
    phantom.cloudcondom.io/db-encrypt: |
      customers:
        - email          # deterministic, allows equality lookups
        - phone_number   # deterministic
        - address        # randomized, no query capability
        - tax_id         # randomized
      payments:
        - card_number    # randomized
        - bank_account   # randomized
      health_records:
        - diagnosis      # randomized
        - prescription   # randomized
Encryption Modes & Trade-offs
Not all encryption is equal. The mode determines what database operations remain possible on encrypted fields:
| Mode | Algorithm | Query Support | Security Level | Use Case |
|---|---|---|---|---|
| Randomized | AES-256-GCM | None — insert and retrieve only | Highest — identical values produce different ciphertext | Addresses, medical data, documents |
| Deterministic | AES-256-SIV | Equality (WHERE x = ?), GROUP BY, DISTINCT, JOIN on exact match | Medium — leaks equality patterns (same input = same ciphertext) | Email lookups, foreign keys, deduplication |
| Order-preserving | OPE scheme | Equality + range queries (BETWEEN, >, <), ORDER BY | Lower — leaks ordering information | Date ranges, numeric ranges (use sparingly) |
| Tokenization | Random token mapping | Equality only (via token lookup) | High — no mathematical relationship between token and value | Credit card numbers, SSNs, government IDs |
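The difference between the randomized and deterministic modes can be demonstrated with a small stdlib-only sketch. To stay runnable without third-party packages, it does not use AES-256-GCM or AES-256-SIV: it uses an HMAC-SHA256 keystream as a stand-in cipher, with a random IV for the randomized mode and an IV derived from the plaintext (the SIV idea) for the deterministic mode. All function names and keys here are assumptions for illustration, and the construction should not be used in production.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """HMAC-SHA256 in counter mode; a stdlib stand-in for a real cipher."""
    blocks = [
        hmac.new(key, iv + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        for counter in range((length + 31) // 32)
    ]
    return b"".join(blocks)[:length]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_randomized(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """Fresh random IV per call, so equal plaintexts give different blobs."""
    iv = secrets.token_bytes(16)
    body = _xor(plaintext, _keystream(enc_key, iv, len(plaintext)))
    tag = hmac.new(mac_key, iv + body, hashlib.sha256).digest()[:16]
    return iv + body + tag


def decrypt_randomized(enc_key: bytes, mac_key: bytes, blob: bytes) -> bytes:
    iv, body, tag = blob[:16], blob[16:-16], blob[-16:]
    expected = hmac.new(mac_key, iv + body, hashlib.sha256).digest()[:16]
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return _xor(body, _keystream(enc_key, iv, len(body)))


def encrypt_deterministic(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """IV is derived from the plaintext, so equal plaintexts give equal blobs."""
    iv = hmac.new(mac_key, plaintext, hashlib.sha256).digest()[:16]
    return iv + _xor(plaintext, _keystream(enc_key, iv, len(plaintext)))


def decrypt_deterministic(enc_key: bytes, mac_key: bytes, blob: bytes) -> bytes:
    iv, body = blob[:16], blob[16:]
    plaintext = _xor(body, _keystream(enc_key, iv, len(body)))
    expected = hmac.new(mac_key, plaintext, hashlib.sha256).digest()[:16]
    if not hmac.compare_digest(iv, expected):
        raise ValueError("authentication failed")
    return plaintext


enc_key, mac_key = secrets.token_bytes(32), secrets.token_bytes(32)
email = b"alice@example.com"
r1 = encrypt_randomized(enc_key, mac_key, email)
r2 = encrypt_randomized(enc_key, mac_key, email)
d1 = encrypt_deterministic(enc_key, mac_key, email)
d2 = encrypt_deterministic(enc_key, mac_key, email)
```

Here `r1` and `r2` differ even though the input is identical, which is why randomized fields cannot be matched in a WHERE clause. `d1` and `d2` are byte-for-byte equal, which is what enables equality lookups and also what leaks equality patterns to anyone holding the ciphertext.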
Be honest with customers about what breaks
Encrypted fields lose database-level functionality. You cannot do LIKE '%search%', full-text search, computed columns, or triggers on encrypted data. Aggregations (SUM, AVG) require decrypting all rows first. This is a fundamental cryptographic limitation, not an engineering gap.
What Works and What Breaks
| Operation | Randomized | Deterministic | Order-Preserving | Unencrypted |
|---|---|---|---|---|
| INSERT / UPDATE | Works | Works | Works | Works |
| SELECT by primary key | Works | Works | Works | Works |
| WHERE column = value | No | Works | Works | Works |
| WHERE column BETWEEN | No | No | Works | Works |
| ORDER BY | No | No | Works | Works |
| LIKE / Full-text search | No | No | No | Works |
| GROUP BY / DISTINCT | No | Works | Works | Works |
| JOIN (equality) | No | Works | Works | Works |
| SUM / AVG (counting rows still works) | No | No | No | Works |
| Indexing | No | Works | Works | Works |
| Database triggers | No | No | No | Works |
| Computed columns | No | No | No | Works |
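One way to use this matrix in practice is to check an application's query patterns against the chosen modes before enabling encryption. The sketch below transcribes the table into a lookup and reports conflicts. The operation labels and the `unsupported_ops` helper are hypothetical names chosen for the example; INSERT/UPDATE and SELECT by primary key are omitted because they work in every mode.

```python
from typing import Dict, List, Set

# Operations each mode supports, transcribed from the table above.
SUPPORTED: Dict[str, Set[str]] = {
    "randomized": set(),
    "deterministic": {"equality", "group_by", "join", "index"},
    "order_preserving": {
        "equality", "range", "order_by", "group_by", "join", "index",
    },
    "none": {
        "equality", "range", "order_by", "like", "group_by", "join",
        "aggregate", "index", "trigger", "computed",
    },
}


def unsupported_ops(modes: Dict[str, str], wanted: Dict[str, Set[str]]) -> List[str]:
    """List every (column, operation) pair the chosen mode cannot serve.

    modes:  column name -> encryption mode (columns not listed are unencrypted)
    wanted: column name -> operations the application runs on that column
    """
    problems = []
    for column, ops in wanted.items():
        mode = modes.get(column, "none")
        for op in sorted(ops - SUPPORTED[mode]):
            problems.append(f"{column}: {op} not available with {mode} encryption")
    return sorted(problems)


modes = {"email": "deterministic", "address": "randomized"}
wanted = {
    "email": {"equality"},
    "address": {"like", "equality"},
    "created_at": {"order_by"},
}
conflicts = unsupported_ops(modes, wanted)
```

Running a check like this during the audit phase surfaces the queries that would need to move into application code, or the columns that need a weaker mode, before any data is encrypted.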
Recommended Data Classification
Not all data needs the same protection level. The practical approach is a tiered model:
| Tier | Data Examples | Protection | Provider Sees |
|---|---|---|---|
| Tier 1: Critical | PII, financial data, health records, government IDs, authentication credentials, encryption keys | Phantom DB proxy — randomized or deterministic encryption, keys in EU OpenBao | Ciphertext only |
| Tier 2: Sensitive | Business logic data, internal communications, customer behavior, pricing models | Phantom DB proxy — deterministic encryption for key lookup fields, randomized for the rest | Ciphertext (with equality patterns on deterministic fields) |
| Tier 3: Operational | Logs, metrics, feature flags, cache data, public content | Standard cloud encryption (CMEK or default). No Phantom proxy needed. | Plaintext accessible |
Practical guidance
Most applications have 5-15 columns that contain truly sensitive data across all their tables. Encrypting those specific fields gives 80-90% of the sovereignty benefit with minimal query impact. You don't need to encrypt created_at or product_name.
Real-World Example: Financial SaaS
Consider a typical financial services application with a customers table:
| Column | Type | Encryption | Rationale |
|---|---|---|---|
| id | UUID | None | Primary key: must be queryable, not sensitive |
| email | VARCHAR | Deterministic | PII, but needed for login lookups (WHERE email = ?) |
| full_name | VARCHAR | Randomized | PII: display only, no queries needed |
| phone | VARCHAR | Randomized | PII: display only |
| tax_id | VARCHAR | Randomized | Highly sensitive: never searched in DB |
| address | JSONB | Randomized | PII: display and shipping only |
| plan_tier | VARCHAR | None | Business data: needed for filtering, not sensitive |
| created_at | TIMESTAMP | None | Operational: needed for sorting and reporting |
| country_code | CHAR(2) | None | Needed for compliance routing, not sensitive alone |
Result: 5 out of 9 columns encrypted. The application continues to work normally: login lookups, plan filtering, and date sorting all function. But a CLOUD Act subpoena for this table returns full_name, phone, tax_id, and address as encrypted blobs, and email as deterministic ciphertext that reveals only which rows share the same email value.
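The login lookup on email keeps working because the proxy can apply the same deterministic transformation to the query parameter as it applied to the stored value. The sketch below shows that idea with an in-memory SQLite database standing in for the managed database. It uses a keyed HMAC token in place of deterministic encryption, so unlike the real mode it cannot be decrypted; the key, table contents, and helper names are assumptions for illustration.

```python
import hashlib
import hmac
import sqlite3

# Hypothetical key; in the architecture above it would come from EU OpenBao.
KEY = b"demo-key-not-for-production"


def email_token(email: str) -> str:
    """Deterministic stand-in: the same email always maps to the same token."""
    return hmac.new(KEY, email.encode(), hashlib.sha256).hexdigest()


conn = sqlite3.connect(":memory:")  # stands in for the managed database
conn.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, email TEXT, plan_tier TEXT)")
conn.execute(
    "INSERT INTO customers VALUES (?, ?, ?)",
    ("c-1", email_token("alice@example.com"), "pro"),
)


def find_by_email(email: str):
    """What the proxy would do: rewrite the parameter, leave the SQL unchanged."""
    return conn.execute(
        "SELECT id, plan_tier FROM customers WHERE email = ?",
        (email_token(email),),
    ).fetchone()


def find_by_email_direct(email: str):
    """Bypassing the proxy: the plaintext never matches the stored value."""
    return conn.execute(
        "SELECT id, plan_tier FROM customers WHERE email = ?", (email,)
    ).fetchone()
```

The SQL text is identical in both functions. Only the parameter changes, which is why the application needs no code changes for equality lookups on deterministic columns.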
Alternative Approaches Compared
| Approach | Sovereignty Level | Query Impact | Engineering Effort | Managed DB Compatible |
|---|---|---|---|---|
| Phantom DB Proxy (field-level encryption) | High for protected fields | Moderate — some ops lost | Medium | Yes |
| CMEK (customer-managed keys) | Low — provider manages KMS | None | Low | Yes |
| MongoDB CSFLE | High for protected fields | Moderate | Medium | MongoDB only |
| Azure Always Encrypted (with enclaves) | Medium — Azure manages enclaves | Some ops preserved in enclave | Medium | Azure SQL only |
| Self-managed DB on Confidential VMs | Complete | None | High — you manage everything | No — self-managed |
| EU sovereign cloud database | High (if provider is EU-only) | None | Medium — migration effort | Yes |
Prior Art & Existing Solutions
Application-level database encryption is not a new concept. Several products exist in this space:
| Product | Approach | Status | Differentiation from Phantom |
|---|---|---|---|
| Baffle | Database encryption proxy | Acquired by Aembit (2024) | Standalone product, no K8s webhook integration, no EU key management story |
| CipherTrust (Thales) | Application-level tokenization & encryption | Active — enterprise product | Heavy enterprise licensing, agent-based, not cloud-native |
| MongoDB CSFLE | Client-side field-level encryption with external KMS | Active — built into MongoDB drivers | MongoDB-only. No cross-database support. |
| Azure Always Encrypted | Column encryption with optional secure enclaves | Active — Azure SQL feature | Azure-only. Enclave managed by Microsoft. Vendor lock-in. |
| Virtru / CipherCloud | Data-centric encryption for SaaS | CipherCloud acquired by Lookout | Focused on SaaS apps (Salesforce, O365), not databases |
Phantom's differentiator
None of these combine automatic sidecar injection (zero app code changes for basic mode), EU-jurisdictional key management, and Kubernetes-native deployment in a single product. The DB proxy is a natural extension of the same webhook + OpenBao architecture that already handles secrets.
Phantom Product Tiers
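The database proxy positions Phantom as a broader data sovereignty platform, not just a secrets manager. The tiers referenced throughout this page are Core (secret injection), Shield (DB proxy with field-level encryption on managed databases), and Vault (self-managed databases on Confidential VMs).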
The Nuclear Option: Self-Managed on Confidential VMs
For maximum protection, skip managed databases entirely:
- Run PostgreSQL / MySQL on Confidential VM nodes (AMD SEV-SNP / Intel TDX) within the cloud
- Phantom manages all database encryption keys via EU OpenBao
- Database TDE (Transparent Data Encryption) uses keys from OpenBao, not cloud KMS
- Backups encrypted with keys the cloud provider cannot access
- Memory encrypted at hardware level — hypervisor cannot inspect queries or data
You lose: automatic patching, built-in HA, managed backups, one-click replicas, connection pooling, query insights. You manage all of this yourself (or via operators like CloudNativePG, Zalando Postgres Operator).
You gain: complete data sovereignty. The cloud provider is reduced to a compute and network layer with zero visibility into your data.
Realistic assessment
Most customers won't want this. The operational burden is significant. Position this as the Phantom Vault tier for regulated industries (banking, healthcare, government) where the compliance requirement justifies the operational cost. For most customers, Phantom Shield (DB proxy with field-level encryption on managed databases) is the right balance.
CLOUD Act Subpoena Scenario
What happens when a US government agency serves a CLOUD Act order to the cloud provider for a customer's database:
| Scenario | What Provider Hands Over | Usable by Government? |
|---|---|---|
| No protection (standard managed DB) | Full database dump — all tables, all rows, plaintext | Fully usable |
| CMEK only | Full database dump — provider decrypts with KMS key they manage | Fully usable |
| Phantom Shield (DB proxy) | Database dump with Tier 1/2 columns as ciphertext. Unencrypted columns readable. Schema visible. | Partially — structure and non-sensitive data visible, but PII/financial data is encrypted blobs. Keys are in EU OpenBao, outside US jurisdiction. |
| Phantom Vault (self-managed on confidential VMs) | Encrypted disk images and encrypted memory snapshots | Not usable — no keys, no plaintext anywhere |
The legal argument
With Phantom Shield, the EU entity (customer) controls the encryption keys via their EU-hosted OpenBao instance. The US cloud provider does not possess the technical capability to decrypt protected fields. Under the CLOUD Act, providers can only be compelled to produce data they can access. Ciphertext they cannot decrypt is arguably outside the scope of a production order, though this remains an evolving legal area.
Implementation Considerations
Performance Impact
| Operation | Overhead | Notes |
|---|---|---|
| Encrypt on write | ~0.1-0.5ms per field | AES-256-GCM is hardware-accelerated (AES-NI) |
| Decrypt on read | ~0.1-0.5ms per field | Same — hardware-accelerated |
| Proxy connection overhead | ~1-2ms per query | Local sidecar communication via localhost |
| Key fetch (cached) | ~0ms | Keys cached in-memory by sidecar after initial fetch |
| Key fetch (cold start) | ~10-50ms | One-time per pod startup from EU OpenBao |
| Data size increase | ~30-40% (varies with field length) | Fixed IV + auth tag overhead per encrypted field, plus any text encoding; short fields grow proportionally more |
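The data size figure can be made concrete with a little arithmetic. The sketch below assumes the common AES-GCM parameters of a 12-byte nonce and a 16-byte authentication tag, with ciphertext the same length as the plaintext, and optionally base64 encoding for storage in a text column. Whether Phantom stores blobs as binary or text is not stated here, so both cases are shown; the function names are illustrative.

```python
import math

GCM_NONCE_BYTES = 12  # assumed: the standard 96-bit GCM nonce
GCM_TAG_BYTES = 16    # assumed: the full 128-bit authentication tag


def stored_size(plaintext_bytes: int, base64_text: bool = False) -> int:
    """Bytes stored for one encrypted field (nonce + ciphertext + tag)."""
    raw = GCM_NONCE_BYTES + plaintext_bytes + GCM_TAG_BYTES
    if base64_text:
        # Base64 with padding emits 4 output characters per 3 input bytes.
        return 4 * math.ceil(raw / 3)
    return raw


def growth_percent(plaintext_bytes: int, base64_text: bool = False) -> float:
    """Relative size increase compared with the plaintext field."""
    extra = stored_size(plaintext_bytes, base64_text) - plaintext_bytes
    return 100.0 * extra / plaintext_bytes


short_binary = stored_size(20)        # 20-byte value stored as binary
short_text = stored_size(20, True)    # same value stored as base64 text
large_text = stored_size(1000, True)  # 1000-byte value stored as base64 text
```

Under these assumptions a 20-byte value becomes 48 bytes as binary and 64 bytes as base64 text, while a 1000-byte value stored as base64 text grows by roughly 37%. The percentage overhead is dominated by field length and encoding, so tables made up of many short encrypted fields will grow more than the headline figure suggests.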
Migration Strategy
- Audit — identify sensitive columns across all tables (typically 5-15 per application)
- Classify — assign Tier 1 (critical), Tier 2 (sensitive), or Tier 3 (operational) to each column
- Encrypt in place — run migration that reads plaintext, encrypts via proxy, writes back. Can be done rolling.
- Enable proxy — switch application to connect via Phantom DB proxy. No application code changes for basic mode.
- Verify — confirm encrypted fields return ciphertext when queried directly (bypassing proxy)
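The encrypt-in-place and verify steps can be sketched as a batched, re-runnable migration. The example uses SQLite from the standard library in place of a managed database. The `enc:v1:` marker, the `migrate_column` helper, and the placeholder codec are assumptions for illustration; table and column names are interpolated into the SQL and must therefore come from trusted configuration, never from user input.

```python
import base64
import sqlite3
from typing import Callable

MARKER = "enc:v1:"  # hypothetical prefix marking values that are already encrypted


def migrate_column(
    conn: sqlite3.Connection,
    table: str,
    pk: str,
    column: str,
    encrypt: Callable[[bytes], bytes],
    batch_size: int = 500,
) -> int:
    """Encrypt one column in place, batch by batch. Safe to re-run or resume."""
    total = 0
    while True:
        # Select only rows that do not carry the marker yet.
        rows = conn.execute(
            f"SELECT {pk}, {column} FROM {table} "
            f"WHERE {column} IS NOT NULL AND {column} NOT LIKE ? LIMIT ?",
            (MARKER + "%", batch_size),
        ).fetchall()
        if not rows:
            return total
        for key, value in rows:
            blob = base64.b64encode(encrypt(value.encode())).decode()
            conn.execute(
                f"UPDATE {table} SET {column} = ? WHERE {pk} = ?",
                (MARKER + blob, key),
            )
        conn.commit()  # one commit per batch keeps transactions short
        total += len(rows)


def all_encrypted(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Verify step: a direct query should find no unmarked values."""
    remaining = conn.execute(
        f"SELECT COUNT(*) FROM {table} "
        f"WHERE {column} IS NOT NULL AND {column} NOT LIKE ?",
        (MARKER + "%",),
    ).fetchone()[0]
    return remaining == 0


# Placeholder codec: byte reversal is NOT encryption, it only makes the
# sketch runnable without a crypto dependency.
def toy_encrypt(data: bytes) -> bytes:
    return data[::-1]
```

Because each batch selects only unmarked rows, the migration can be interrupted and resumed, which is what makes a rolling migration practical. A version marker in the stored value also leaves room for key rotation later.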
Supported Databases
| Database | Protocol | Proxy Feasibility | Notes |
|---|---|---|---|
| PostgreSQL (Cloud SQL, RDS, Aurora, AlloyDB) | PostgreSQL wire protocol | High | Well-documented protocol. pgcrypto for reference. PgBouncer-style proxy pattern proven. |
| MySQL (Cloud SQL, RDS, Aurora) | MySQL wire protocol | High | Well-documented protocol. ProxySQL/Vitess patterns proven. |
| MongoDB (Atlas) | MongoDB wire protocol | Medium | BSON protocol. Native CSFLE already exists — could integrate with OpenBao instead of cloud KMS. |
| DynamoDB | HTTPS/JSON API | Medium | API-based, not wire protocol. AWS SDK-level interception or HTTP proxy. |
| Spanner | gRPC | Medium | gRPC proxy possible but complex. Spanner's distributed nature adds complications. |
| Redis (Memorystore, ElastiCache) | RESP protocol | High | Simple protocol. Encrypt values transparently. Redis key names remain plaintext for lookups. |
MVP scope: PostgreSQL first
Start with PostgreSQL proxy support. It covers Cloud SQL (GCP), RDS/Aurora (AWS), and Azure Database for PostgreSQL — the three major managed offerings. The wire protocol is mature and well-documented. Expand to MySQL and MongoDB in subsequent releases.
Key Takeaways
Natural product extension
The DB proxy sidecar uses the exact same architecture as Phantom Core — mutating webhook, sidecar injection, EU OpenBao key management. It's not a new product; it's the same product applied to a different problem. This is the strongest argument for building it.
Don't oversell it
Application-level encryption has real trade-offs. Be transparent: some queries break, performance has overhead, schema design requires thought. Customers who need SELECT * FROM customers WHERE name LIKE '%smith%' on encrypted name fields will be disappointed. Position it correctly — this protects the crown jewels, not everything.
Pricing opportunity
Three tiers justify a premium pricing model. Core ($50-100/node/month) for secret injection. Shield ($100-200/node/month) for DB proxy encryption. Vault (custom pricing) for full self-managed sovereignty. The database story turns a single-feature product into a platform.