Factual Demonstration

The Blind Analysis.

An AI vendor classifies a mammogram without ever seeing the image, or even the cleartext embedding extracted from it. This is the real production pattern used by Owkin, Lifebit, and Mozaic.

The Scenario

The hospital runs ResNet-50 locally over a 4000×4000 mammogram. The CNN produces a 2048-dimensional embedding. That embedding is encrypted and sent to the AI vendor. The vendor runs only the final linear classifier under encryption.

The Problem

Running a heavy CNN (25M parameters) under FHE is impractical. The workaround: the CNN runs in cleartext locally, and only the final linear classifier runs under encryption. That makes the pattern computationally viable and clinically plausible.

The Guarantee

The vendor sees neither the original image nor the cleartext embedding. Even if one tried to invert the embedding to reconstruct the image (model inversion attack), the result would be generic and blurry, not the patient's specific mammogram.

Step 01 · Setup

Define the parameters.

Before encrypting the embedding, we choose CKKS parameters. The 2048-dim embedding fits comfortably in a single ciphertext with 8 192 slots.

Capacity · CKKS

2 048 · Embedding dim (ResNet-50)

8 192 · Slots/ciphertext

3 · Mult. depth

~128 bit · Security

What each term means

CKKS ("approximate" scheme) — FHE family for real numbers. Controlled approximation noise (~10⁻⁹). It is the standard scheme for ML inference over continuous feature vectors — exactly the case of this embedding.

2 048 = ResNet-50 output — ResNet-50 (25M parameters, trained on ImageNet or a clinical dataset) produces a 2 048-activation vector at its penultimate layer. Because that layer follows global average pooling, the vector has 2 048 entries whatever the input size (224×224 to 4000×4000), and it summarizes the image features the classifier uses.

Fits in 1 ciphertext — 2 048 slots occupied out of 8 192 available. 6 144 slots remain empty. CKKS processes everything in parallel with a handful of operations.

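Depth 3 — Enough for: 1 Mul (embedding × weights) + 1 InnerSum + 1 scalar Mul + Adds. The multiplications (each followed by a rescale) are what consume levels; rotations and additions do not, so the final linear classifier needs only 2-3 levels.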

RLWE base · ~128 bits — Security rests on Ring-LWE, a lattice problem from the same family as the module-lattice problems behind ML-KEM and ML-DSA, which NIST standardized as post-quantum cryptography.
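The capacity figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: `planPacking` and its fields are hypothetical names, it assumes the usual CKKS packing of N/2 values per ciphertext with N = 16 384, and it assumes that each rescaled multiplication costs one level while rotations and additions cost none.

```go
package main

import "fmt"

// packingPlan summarizes how a feature vector fits into one CKKS ciphertext.
type packingPlan struct {
	Slots      int // usable slots: ring degree / 2
	Used       int // slots occupied by the embedding
	Free       int // slots left as zero padding
	LevelsUsed int // levels consumed by rescaled multiplications
}

// planPacking assumes N/2 slots per ciphertext and one level per rescaled
// multiplication. Rotations and additions are assumed to cost no level.
func planPacking(ringDegree, embeddingDim, rescaledMuls, depth int) (packingPlan, error) {
	slots := ringDegree / 2
	if embeddingDim > slots {
		return packingPlan{}, fmt.Errorf("embedding (%d) exceeds %d slots", embeddingDim, slots)
	}
	if rescaledMuls > depth {
		return packingPlan{}, fmt.Errorf("%d multiplications exceed depth %d", rescaledMuls, depth)
	}
	return packingPlan{
		Slots:      slots,
		Used:       embeddingDim,
		Free:       slots - embeddingDim,
		LevelsUsed: rescaledMuls,
	}, nil
}

func main() {
	// Two rescaled multiplications (weights, then the 0.2 slope) is this
	// sketch's reading of the pipeline, not a measured value.
	plan, err := planPacking(16384, 2048, 2, 3)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", plan)
}
```

With these inputs the plan reports 8 192 slots, 6 144 of them free, and 2 of the 3 levels in use.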

Step 02 · Keys

Keys generated on the phone.

The secret key is born on the patient's device and never leaves it. Only public material travels to the AI vendor: the public key and the evaluation keys (rlk, Galois), none of which can decrypt.

What is generated

~50 ms · Total time

~7 KB · Public key (pk)

~44 KB · Relinearization (rlk)

~MB · Galois keys (rotation)

What each key does

sk (secret key) — The only one that can decrypt. It stays locked in the patient app, protected by the iPhone Secure Enclave or the Android Trusted Execution Environment.

pk (public key) — Lets anyone encrypt data for you. It is sent to the vendor. It can only encrypt, never decrypt. Like a mailbox with a slot: anyone can drop a letter in, but only you can open it.

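rlk (relinearization) — Lets the vendor multiply two ciphertexts and bring the result back to normal size. Without rlk, ciphertext × ciphertext multiplications cannot be chained. Multiplying by plaintext weights, as the classifier here does, does not require it. Using rlk reveals nothing about the data.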

Galois keys (rotation) — Enable vector-sum operations (InnerSum). Needed for the inner product of the logistic model.
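To make the role of the rotation keys concrete, the sketch below lists the shifts a rotate-and-add inner sum would need for a 2048-entry vector. `rotationSteps` is a hypothetical helper, and the one-key-per-shift correspondence is an assumption of this sketch rather than a statement about any particular library.

```go
package main

import "fmt"

// rotationSteps lists the power-of-two shifts a rotate-and-add inner sum
// needs for a vector of length dim (dim is assumed to be a power of two).
// One Galois key per shift is assumed.
func rotationSteps(dim int) []int {
	var steps []int
	for k := 1; k < dim; k *= 2 {
		steps = append(steps, k)
	}
	return steps
}

func main() {
	steps := rotationSteps(2048)
	fmt.Println(len(steps), steps)
	// prints: 11 [1 2 4 8 16 32 64 128 256 512 1024]
}
```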

Step 03 · Local extraction

ResNet-50 runs locally in cleartext.

The original mammogram (4000×4000 pixels, ~50 MB in DICOM) is processed by ResNet-50 on the hospital's own device, in cleartext, because the CNN is too heavy to run under FHE. The result is a 2 048-activation embedding.

What happens on the local device

// 1. load the original DICOM image
img := loadDICOM("mammogram.dcm")
// img ~ 4000×4000 pixels, ~50 MB

// 2. pretrained ResNet-50 (IN CLEARTEXT)
model := loadResNet50("checkpoint.pt")
embedding := model.extractFeatures(img)
// embedding = [2048] float vector

// 3. only now encrypt the embedding
pt := encoder.Encode(embedding)
ct := encryptor.Encrypt(pt)

// 4. send ONLY ct to the vendor
send(ct)  // 1 MB

Why this pattern is the right one

Heavy CNN in cleartext, locally — Running a ResNet-50 (25M parameters, conv layers, BatchNorm, ReLU, pooling) under FHE would be orders of magnitude slower than cleartext inference, which rules it out in production. The common workaround is to run the CNN in cleartext on the patient/hospital device, where the image is already authorized.

2048-float embedding — The penultimate ResNet-50 layer is a 2048-activation vector after GlobalAveragePool. ~37% of those activations are non-zero (sparse, typical of ReLU). This vector carries the features the final classifier needs.

Not invertible in practice — Inverting ResNet-50 (2048 → original 16M-pixel image) is an open research problem. "Model inversion" attacks only recover generic, blurry images — never the patient's specific mammogram.

8 ms just to encrypt — Encrypting the 2048-dim vector takes ~8 ms on a commodity CPU. Extracting the embedding with ResNet-50 takes ~1 s on CPU or ~200 ms on GPU, but that happens in cleartext, with no FHE overhead.
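The reason the embedding has a fixed length regardless of image resolution is the global average pooling step mentioned above: each channel collapses to a single mean. A minimal sketch with toy data, assuming a `[channel][row][col]` layout (`globalAveragePool` is a hypothetical helper, not part of any ResNet implementation):

```go
package main

import "fmt"

// globalAveragePool reduces a feature map laid out as
// [channel][row][col] to one mean value per channel.
func globalAveragePool(featureMap [][][]float64) []float64 {
	out := make([]float64, len(featureMap))
	for c, channel := range featureMap {
		sum, count := 0.0, 0
		for _, row := range channel {
			for _, v := range row {
				sum += v
				count++
			}
		}
		if count > 0 {
			out[c] = sum / float64(count)
		}
	}
	return out
}

func main() {
	// Two channels at two different spatial sizes (toy values).
	small := [][][]float64{{{1, 3}}, {{2, 2}}}
	large := [][][]float64{{{1, 3}, {5, 7}}, {{0, 0}, {4, 4}}}
	fmt.Println(globalAveragePool(small), globalAveragePool(large))
	// prints: [2 2] [4 2]
}
```

Both inputs yield a vector with one entry per channel. With ResNet-50's 2048 final channels, that gives the 2048-float embedding.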

Step 04 · Transit

What the vendor receives.

The AI vendor receives approximately 1 MB. Without the secret key (which stayed on the phone), the encrypted payload is indistinguishable from random noise.

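Incoming bytes (real sample; these leading bytes appear to be the serialization header and decode as readable ASCII, while the pseudo-random coefficient data follows)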

7b 22 50 6c 61 69 6e
65 78 74 4d 65 74 61
61 74 61 22 3a 7b 22
63 61 6c 65 22 3a 7b
...

1 MB · Total bytes transmitted
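The ~1 MB figure is consistent with a back-of-the-envelope size estimate. The sketch below assumes a ciphertext made of 2 polynomials, ring degree N = 16 384, 4 RNS moduli (depth 3 plus one base level) and 8 bytes per residue. These layout details are assumptions for illustration, and a real serialization adds headers.

```go
package main

import "fmt"

// ciphertextBytes estimates the raw size of a fresh RLWE ciphertext:
// 2 polynomials x ringDegree coefficients x moduli residues x 8 bytes.
// The 8-byte residue and the RNS layout are assumptions of this sketch.
func ciphertextBytes(ringDegree, moduli int) int {
	const polys = 2
	const bytesPerResidue = 8
	return polys * ringDegree * moduli * bytesPerResidue
}

func main() {
	size := ciphertextBytes(16384, 4)
	fmt.Printf("%d bytes = %.2f MiB\n", size, float64(size)/(1<<20))
	// prints: 1048576 bytes = 1.00 MiB
}
```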

Why it is secure

Pseudo-random bytes — Each ciphertext is built by adding a carefully calibrated "noise" polynomial. To any observer without the secret key, it is computationally indistinguishable from random noise, under the RLWE hardness assumption.

No fragile chain of custody — In today's model, the cleartext image passes through dozens of hops (CDN, load balancer, microservice, database, logs), and each one is a leakage point. Under FHE, each hop sees only pseudo-random bytes.

Even if it leaks — If an attacker captures all the vendor's traffic for months, they still cannot extract a single image. The guarantee rests on a mathematical hardness assumption, not on "hoping no one looks".

Step 05 · Classifier

Linear classifier under encryption.

The vendor runs only the final linear classifier (the model's last dense layer) over the encrypted embedding. The classifier weights are public (plaintext, not encrypted), trained offline on a labeled clinical dataset (e.g. RSNA Mammography or DDSM for mammograms; CheXpert for chest X-rays).

The algorithm · real pseudo-code

// 1. public weights (2048 floats) and bias, both in plaintext
weights := loadWeights("classifier.pt")
bias := -0.6

// 2. multiply embedding × weights slot by slot, then rescale
ctMul := evaluator.Mul(ctEmb, weights)
ctMul = evaluator.Rescale(ctMul)

// 3. sum the 2048 products into the logit, then add the bias
ctLogit := evaluator.InnerSum(ctMul, 2048)
ctLogit = evaluator.Add(ctLogit, bias)

// 4. sigmoid linearization: p ≈ 0.5 + 0.2·z
ctProb := evaluator.Mul(ctLogit, 0.2)
ctProb = evaluator.Add(ctProb, 0.5)
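The pseudo-code above can be mirrored in cleartext to see what each step computes. The sketch below simulates the slot vector with a plain slice and uses rotate-and-add to emulate InnerSum. It involves no encryption, the toy inputs are hypothetical, and the function names are illustrative rather than any library's API.

```go
package main

import "fmt"

// rotate returns v shifted left by k slots, cyclically, which is how a
// slot rotation acts on the packed vector.
func rotate(v []float64, k int) []float64 {
	out := make([]float64, len(v))
	for i := range v {
		out[i] = v[(i+k)%len(v)]
	}
	return out
}

// innerSum adds rotated copies (shifts 1, 2, 4, ...) so that slot 0 ends
// up holding the sum of the first n slots; n must be a power of two.
// It returns the summed vector and the number of rotations used.
func innerSum(v []float64, n int) ([]float64, int) {
	acc := append([]float64(nil), v...)
	rotations := 0
	for k := 1; k < n; k *= 2 {
		r := rotate(acc, k)
		for i := range acc {
			acc[i] += r[i]
		}
		rotations++
	}
	return acc, rotations
}

// score mirrors the four steps in cleartext: multiply, sum, bias, line.
func score(embedding, weights []float64, bias float64, slots int) (float64, int) {
	packed := make([]float64, slots) // unused slots stay zero
	for i := range embedding {
		packed[i] = embedding[i] * weights[i]
	}
	summed, rotations := innerSum(packed, len(embedding))
	logit := summed[0] + bias
	return 0.5 + 0.2*logit, rotations
}

func main() {
	// Hypothetical toy values; a real embedding has 2048 entries.
	emb := []float64{1, 2, 3, 4}
	w := []float64{0.5, 0.25, 0.1, 0.1}
	p, rot := score(emb, w, -0.6, 16)
	fmt.Printf("score=%.3f rotations=%d\n", p, rot)
}
```

For a 2048-entry embedding packed into 8 192 slots, the same loop performs 11 rotations.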

What each operation does

Element-wise Mul — Multiplies each of the 2048 embedding activations by the corresponding classifier weight. CKKS runs them all in parallel in a single operation.

InnerSum over 2048 slots — Sums the 2048 products to obtain the logit (inner product + bias). Internally it uses log₂(2048) = 11 Galois rotations. It is the most expensive step of the pipeline.

Sigmoid linearization — The real sigmoid is non-linear and expensive in FHE. For a single inference where logits land in the [-2, +2] range, we approximate with a line: p ≈ 0.5 + 0.2·z. The error stays at about 0.03 or less over that range, peaking near |z| ≈ 1.

79 ms total — Real measured time. Acceptable for non-emergency diagnosis. In production with FPGA/HEXL acceleration, it drops to ~10 ms.

Step 06 · Result

Patient decrypts on the phone.

The result ciphertext goes back to the patient's device. Only there, with the secret key, does the number become readable. The AI vendor never sees this number.

Clinical probability

0.798

SUSPICIOUS FINDING

radiologist review recommended

How to read the result

Score > 0.5 — Suspicious finding. The app tells the patient to seek review by a human radiologist. AI is triage, not diagnosis.

Score < 0.5 — No relevant finding. It can be archived or kept for the next routine screening.

Final decision is always human — The AI model never replaces the radiologist. It is triage support: it helps prioritize suspicious cases in a queue of thousands of exams, reducing time to specialist review for the truly critical ones.

Vendor doesn't even see the score — The result is encrypted on the way back. The vendor only knows that it processed a request. Not even the clinical outcome stays with the vendor.

Step 07 · Validation

FHE vs plaintext.

To show that the encrypted computation matches the plaintext one up to CKKS approximation error, we recompute the same inference directly over the cleartext embedding and compare.

Numerical comparison

Metric	FHE	Plaintext
Probability	0.798029	0.798029
Error	1.4 × 10⁻⁹
Final decision	IDENTICAL

1.4 × 10⁻⁹ · Absolute error

Why this precision is enough

9 decimal places — The gap between the CKKS computation and the plaintext computation is on the order of 10⁻⁹. For comparison, the sensitivity and specificity of clinical classifiers are typically reported to about three decimal places (10⁻³). Nine decimals of precision is far more than enough.

Robust threshold — Since the decision is binary (probability > 0.5 or not), tiny numerical variations do not flip the final classification. A 10⁻⁹ error would never turn a 0.51 score into 0.49.

Reproducible audit — Any independent auditor can rerun the same operations on the same encrypted embedding and obtain an encrypted result that decrypts to the same score, up to CKKS approximation error. Mathematical reproducibility is part of the clinical and regulatory guarantee.
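The comparison described in this step amounts to two checks: the absolute gap between the two scores, and whether both land on the same side of the 0.5 threshold. A minimal sketch follows, with hypothetical helper names and a decrypted value simulated as the plaintext score plus the reported error.

```go
package main

import (
	"fmt"
	"math"
)

// compare reports the absolute gap between the decrypted and the cleartext
// score, and whether both fall on the same side of the threshold.
func compare(fheScore, plainScore, threshold float64) (absErr float64, sameDecision bool) {
	absErr = math.Abs(fheScore - plainScore)
	sameDecision = (fheScore > threshold) == (plainScore > threshold)
	return absErr, sameDecision
}

// canFlip reports whether an error of size eps could move the score
// across the threshold.
func canFlip(score, threshold, eps float64) bool {
	return math.Abs(score-threshold) <= eps
}

func main() {
	// Simulated decrypted value: plaintext score plus the reported error.
	plain := 0.798029
	fhe := plain + 1.4e-9
	absErr, same := compare(fhe, plain, 0.5)
	fmt.Printf("error=%.1e same=%v flip=%v\n", absErr, same, canFlip(plain, 0.5, 1.4e-9))
}
```

Only a score within about 10⁻⁹ of the threshold could change sides, which is why a 0.51 score cannot become 0.49.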

Step 08 · Adversarial

Four layers of protection.

The AI vendor is in a weak position even under adaptive attack. Four distinct layers prevent recovery of the original image.

Layer 1 — Opaque ciphertext

Attempt: the vendor tries to read the encrypted embedding it received.

Defense: without the secret key (on the patient's device), the ciphertext bytes are pseudo-random. Recovering the cleartext embedding would require solving Ring-LWE at N=16 384 — ~2¹²⁸ operations. Infeasible.

Layer 2 — Inverting ResNet-50

Attempt: hypothetically, the vendor obtains the cleartext embedding (imagine a breach). It tries to invert ResNet-50 to reconstruct the image.

Defense: inverting ResNet-50 (2048 → 16 million pixels) is an open research problem. Known "model inversion" attacks (Fredrikson 2015, Zhang 2020) only recover generic, blurry images, never the patient's specific mammogram. Such reconstructions are far too degraded to support a clinical diagnosis.

Layer 3 — Inferring from the score

Attempt: the vendor tries to infer the image from the final score alone (0.798).

Defense: 1 output number, 2048 features, 16 million input pixels. The system is massively under-determined: a single scalar carries at most a few bits of information, so essentially none of the original image can be recovered from it.

Layer 4 — Regulatory compliance

LGPD art. 11 + HIPAA — Health data is a special category. Under traditional architecture, sending a clinical image to a foreign vendor is an international transfer of sensitive data. Under FHE + local embedding, the "transfer" is mathematically null: the vendor receives pseudo-random bytes. The data was technically never shared. This is the pattern used by Owkin in Sanofi-Roche partnerships for oncology imaging.

Step 09 · Summary

Medical AI without seeing the patient.

In under 100 ms, the vendor classified a mammogram without ever having seen the image or the cleartext embedding. This is the pattern used in production by Owkin, Lifebit, Mozaic.

The complete flow

Patient generated keys locally on the device
ResNet-50 ran in cleartext over the 4000×4000 mammogram
2048-dim embedding generated locally
Embedding encrypted before leaving
Vendor received only 1 MB of pseudo-random bytes
Final linear classifier executed under encryption (79 ms)
Encrypted score returned
Patient decrypted and saw the clinical recommendation

Real numbers

8 ms · Embedding encryption

79 ms · Encrypted classifier

1 MB · Ciphertext

10⁻⁹ · Precision error

0.798 · Probability result

2 048 · Embedding dim

What this unlocks

Labs and hospitals can offer diagnostic AI via partnership with foreign vendors without violating LGPD/HIPAA. The image never leaves the patient's control, and not even the embedding is accessible to the vendor.

This is the real production pattern

Owkin (Paris/NY) runs this model in partnerships with Sanofi and Roche for oncology imaging. Lifebit (UK) applies the same pattern for genomics. Mozaic (USA) for general radiology. FHE over embeddings is one of the few computationally viable approaches for medical AI with mathematical privacy guarantees.

2 eBooks use this primitive

FHE_LABORATORIOS_EBOOK (ch. IV: diagnostic AI under encryption) · FHE_HOSPITAIS_EBOOK (ch. IV: radiological AI without handing the patient to the vendor).