Sensitive data handling: Member portal

Status: Draft. Phase A deliverable per ENG-187. Must be reviewed + signed off before Phase B sections that collect SSN / immigration documents / payment data ship code.

Owner: Taha (founder, CTO-of-record). Reviewer: Asad (CFO, compliance owner).

Scope: every field the member portal collects that, if leaked, would create an identity-theft, fraud, or HIPAA breach risk. Concretely: SSN, immigration document numbers, full DOB, full address, full income detail, payment account / card numbers. The control set below is the floor — Phase B sections may add controls but cannot remove any.

This doc lives under docs/security-compliance/ alongside encryption-policy.md, data-retention-policy.md, and access-control-policy.md. It's referenced from ENG-187 under the "Sensitive data handling" plan section.


1. Storage at rest

| Field | At-rest treatment | Rationale |
| --- | --- | --- |
| SSN | CSFLE field-level encryption with KMS-CMK-derived data keys, AES-256-CBC, deterministic algorithm so equality-match queries work for SSA verification re-runs | Highest-value identity theft vector. AWS BAA covers KMS. Driver-layer encryption means even direct Atlas queries by app_write return ciphertext |
| Immigration document numbers (I-551, I-94, I-766, etc.) | CSFLE field-level encryption, same key family as SSN | Same risk class. SAVE verification re-uses the value |
| Full DOB | CSFLE field-level encryption | Combined with name + ZIP this is a re-identification vector |
| Full home address | Plaintext in main collection, encrypted backups | Already widely-handled in agent flows; not a unique re-id vector on its own |
| Phone, email | Plaintext | Standard contact info |
| Income per source ($amount, frequency, employer name) | Plaintext in main collection | Member sees this in their portal as the editable record; FFM submission requires plaintext |
| Bank routing + account number | CSFLE field-level encryption, separate key from SSN family for blast-radius isolation | Direct fraud vector |
| Card primary account number (PAN) | Never stored in member_applications. PAN tokenized via payment vendor (Stripe / Square / chosen Phase B vendor) at moment of entry; we store only vendor-side token + last-4 + brand | PCI-DSS: storing PANs requires Level 1 attestation we don't want to assume |

CSFLE is enforced at the MongoDB driver layer (autoEncryption with mongocryptd). Application code sets values as plaintext and the driver encrypts them on write; reads through a CSFLE-configured driver decrypt automatically. Direct Atlas queries (e.g. an Atlas admin running a query in the UI) return ciphertext for encrypted fields. This is the property we want for tamper, leak, and insider-access containment.

Key management: KMS-CMK in the prod AWS account, alias alias/prod-member-portal-csfle-master, rotation enabled (annual). Data Encryption Keys (DEKs) are stored in a dedicated __keyVault collection per MongoDB CSFLE convention. DEKs are wrapped by the CMK.

Key rotation cadence: annual rotation of the master CMK (automatic). Existing data does not need re-encryption after a rotation: under envelope encryption the CMK only wraps the DEKs, KMS retains earlier key versions so DEKs wrapped before the rotation can still be unwrapped, and only new wrappings use the new key version. Member-portal application data has a 7-year retention horizon (per data-retention-policy.md); a re-encryption job is deferred unless a compromise indicator forces it.
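To make the storage table concrete, here is a minimal sketch of how the driver-side configuration could be assembled. It is not the project's actual config: the database name ("portal"), the key vault database ("encryption"), the property names, placing DOB under the identity key, and every algorithm choice other than deterministic encryption for SSN are assumptions. The function names are hypothetical.

```typescript
// Sketch only: names and per-field algorithm choices (other than SSN) are
// assumptions for illustration, not the project's real schema.

type Algorithm =
  | "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
  | "AEAD_AES_256_CBC_HMAC_SHA_512-Random";

const DETERMINISTIC: Algorithm = "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic";
const RANDOM: Algorithm = "AEAD_AES_256_CBC_HMAC_SHA_512-Random";

// A DEK id is a BSON UUID in practice; `unknown` keeps this sketch free of
// driver imports.
function encrypted(algorithm: Algorithm, dekId: unknown) {
  return { encrypt: { bsonType: "string", algorithm, keyId: [dekId] } };
}

function buildAutoEncryptionOptions(
  awsCredentials: { accessKeyId: string; secretAccessKey: string },
  identityDekId: unknown, // DEK for the SSN key family
  bankingDekId: unknown,  // separate DEK for blast-radius isolation
) {
  return {
    keyVaultNamespace: "encryption.__keyVault",
    kmsProviders: { aws: awsCredentials },
    schemaMap: {
      "portal.member_applications": {
        bsonType: "object",
        properties: {
          // Deterministic so equality-match queries work on SSN.
          ssn: encrypted(DETERMINISTIC, identityDekId),
          immigration_document_number: encrypted(RANDOM, identityDekId),
          date_of_birth: encrypted(RANDOM, identityDekId),
          bank_routing_number: encrypted(RANDOM, bankingDekId),
          bank_account_number: encrypted(RANDOM, bankingDekId),
        },
      },
    },
  };
}
```

The returned object has the shape the Node.js driver accepts as the autoEncryption option of MongoClient. Keeping the banking fields on their own DEK means a leaked identity DEK does not expose bank details, and the reverse.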

2. Transport

  • TLS 1.2 minimum on every hop: viewer → CloudFront, CloudFront → ALB (custom-origin), ALB → ECS task. The ACM cert covering *.askflorence.health is used for both the CloudFront viewer cert AND the origin-side TLS handshake to ALB
  • Internal VPC traffic is TLS too — no plaintext on the wire even within our VPC. ALB → task uses HTTPS (port 443 / certified at the task per ECS service config)
  • MongoDB Atlas connections use TLS 1.2 minimum, certificate-validated

3. Presentation to the member

For SSN and immigration document numbers — the two highest-value re-identification fields — apply this rule:

Default presentation: masked. When a portal page renders a previously-captured value, the value is masked: ***-**-1234 for SSN, A123 4567 **** style for document numbers. The full value is never sent to the browser on the standard read path.

Edit path: step-up verification. When the member clicks "edit my SSN" or "edit my immigration document":

  1. The page presents a step-up verification challenge (re-enter password OR re-do magic link OR TOTP if enabled)
  2. On successful challenge, server returns the unmasked value to the browser ONE TIME, in the edit form
  3. After submit (or cancel), the page reverts to masked display
  4. The unmasked-value response sets both a Cache-Control: no-store header and a Clear-Site-Data: "cache" header to prevent retention in the browser cache
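As a small illustration of step 4, the two headers could be produced by a helper like the one below. The function name is hypothetical. The one detail worth showing is that the Clear-Site-Data directive is a quoted string, so the double quotes are part of the header value.

```typescript
// Headers for the one-time unmasked response (hypothetical helper name).
function revealResponseHeaders(): Record<string, string> {
  return {
    // Tells the browser and intermediaries not to store the response.
    "Cache-Control": "no-store",
    // The directive must be sent with its double quotes: "cache".
    "Clear-Site-Data": '"cache"',
  };
}
```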

No "always-visible" mode, ever. Members who want to verify their SSN with a third party (employer, bank) must reveal it through the step-up path.

For DOB: shown in full (it's already on the calculator and on every form section the member filled). For income: shown in full (the member wrote it). For address: shown in full. For payment fields: vendor-tokenized; we only show **** 1234 · Visa (last-4 + brand).

4. Step-up verification before reveal

The challenge required to unmask a sensitive field:

| Phase | Challenge |
| --- | --- |
| MVP-1 (single-factor) | Re-enter email + receive magic link + click within 10 min |
| Phase F (MFA on) | Magic link + TOTP / hardware-key challenge |

The challenge response token is single-use, 10-min TTL, bound to the specific field reveal (scope: "reveal_ssn" in the JWT claim). Re-using the token for a different reveal action fails server-side validation.
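The token rules above (single use, 10-minute TTL, bound to one reveal scope) can be sketched as a claims check. This is an illustration under assumptions: it presumes the JWT signature has already been verified elsewhere, the claim names other than scope follow common JWT usage rather than anything stated here, the scope values other than "reveal_ssn" are hypothetical, and the in-memory set stands in for whatever shared store production would need across ECS tasks.

```typescript
// Only "reveal_ssn" appears in this doc; the other scopes are hypothetical.
type RevealScope = "reveal_ssn" | "reveal_immigration_doc" | "reveal_bank_account";

interface StepUpClaims {
  jti: string;        // unique token id, used for single-use tracking
  sub: string;        // member account id the token was issued to
  scope: RevealScope; // the one reveal action this token authorizes
  iat: number;        // issued-at, seconds since epoch
  exp: number;        // expiry, seconds since epoch
}

const STEP_UP_TTL_SECONDS = 10 * 60;

// In-memory stand-in for a shared store of already-redeemed token ids.
const redeemedTokenIds = new Set<string>();

// Returns true exactly once per valid token; signature verification is
// assumed to have happened before this function is called.
function redeemStepUpToken(
  claims: StepUpClaims,
  accountId: string,
  requestedScope: RevealScope,
  nowSeconds: number,
): boolean {
  if (claims.sub !== accountId) return false;
  if (claims.scope !== requestedScope) return false; // bound to one reveal
  if (nowSeconds >= claims.exp) return false;         // expired
  if (claims.exp - claims.iat > STEP_UP_TTL_SECONDS) return false; // over-long TTL
  if (redeemedTokenIds.has(claims.jti)) return false; // already used
  redeemedTokenIds.add(claims.jti);
  return true;
}
```

A token is only marked as redeemed after every other check passes, so a request with the wrong scope does not burn a token the member could still use for the reveal it was issued for.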

5. Logging discipline

Application logs MUST NEVER include sensitive field values. The deny-list is enforced at the structured-logger layer (src/lib/logger.ts — to be created Phase A). Deny-listed property names:

ssn, ssn_last_four, immigration_document_number, immigration_doc_number,
date_of_birth, dob, bank_routing_number, bank_account_number,
card_pan, card_number, card_cvv, card_exp

The logger redacts these property names recursively in any object passed to logger.{info,warn,error} calls — value becomes [REDACTED]. A CI check (Phase B) scans logger.* call sites for hand-built strings that include deny-listed substrings.
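A minimal sketch of the recursive redaction described above, assuming the logger calls a function like this on every object before serializing it. The function name and the handling of repeated object references are assumptions; src/lib/logger.ts does not exist yet, so this is not its implementation.

```typescript
// Deny-listed property names, copied from the list above.
const LOG_DENY_LIST = new Set<string>([
  "ssn", "ssn_last_four", "immigration_document_number", "immigration_doc_number",
  "date_of_birth", "dob", "bank_routing_number", "bank_account_number",
  "card_pan", "card_number", "card_cvv", "card_exp",
]);

// Returns a copy of `value` with every deny-listed property replaced by
// "[REDACTED]". Objects already visited (including cycles) are not walked
// again, so a circular structure cannot cause infinite recursion.
function redactForLog(value: unknown, seen: WeakSet<object> = new WeakSet()): unknown {
  if (value === null || typeof value !== "object") return value;
  if (seen.has(value)) return "[ALREADY_VISITED]";
  seen.add(value);
  if (Array.isArray(value)) return value.map((item) => redactForLog(item, seen));
  const out: Record<string, unknown> = {};
  for (const [key, inner] of Object.entries(value)) {
    out[key] = LOG_DENY_LIST.has(key.toLowerCase())
      ? "[REDACTED]"
      : redactForLog(inner, seen);
  }
  return out;
}
```

Matching is by property name only, so this does not catch a sensitive value interpolated into a hand-built string, which is the gap the Phase B CI check is meant to cover.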

Stack traces are stripped of all query, body, headers.cookie, headers.authorization fields before being shipped to CloudWatch / observability backends.

No PHI in error messages returned to clients. Server errors that bubble from validation must use generic copy ("We couldn't save your changes — please try again or contact support").

6. Backup + restore

Encrypted Atlas backups remain encrypted at rest. The MongoDB CSFLE master key (KMS-CMK) is also backed up — losing it means permanent loss of all sensitive data.

Dual-control restore: restoring from backup requires (a) Atlas project admin (Taha) + (b) AWS KMS-CMK access (Taha; Asad as backup). Both controls are documented in access-control-policy.md. The restore runbook lives alongside break-glass-root-login.md and atlas-user-provisioning.md — to be created Phase A as member-portal-restore.md in the same runbook directory.

7. Egress controls

Sensitive fields NEVER appear in:

  • HubSpot sync — member-portal data has zero HubSpot egress. The HubSpot sync worker explicitly skips the member_applications collection. Codified in src/lib/hubspot-sync.ts allowlist (collection-scoped, not field-scoped, for defense-in-depth)
  • First-party analytics events (OpenPanel) — event names + properties use only sanitized fields (step_key, section_completed, submission_status, bucketed values). NEVER include identity values, SSN, doc numbers, raw income, etc. (ADR 0009)
  • SES email content — email templates pull only the member's first name, the plan name, and the application ID. No SSN, DOB, income, or document numbers in email bodies
  • Outbound webhooks (Phase E+ FFM ack handler) — payloads sanitized via a separate outboundPayloadFilter before send

A CI check scans for hand-built JSON bodies that include sensitive field names.
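The collection-scoped HubSpot allowlist from the first bullet could look roughly like the sketch below. The collection names in the allowlist are hypothetical placeholders, as is the function name; the only fact taken from this doc is that member_applications must never be on the list.

```typescript
// Hypothetical allowlist contents; member_applications is deliberately absent.
const HUBSPOT_SYNC_ALLOWLIST: ReadonlySet<string> = new Set([
  "marketing_leads",
  "agent_contacts",
]);

// Throws unless the collection is explicitly allowlisted. Anything not
// listed, including collections added later, is blocked by default.
function assertHubspotSyncable(collection: string): void {
  if (!HUBSPOT_SYNC_ALLOWLIST.has(collection)) {
    throw new Error(`collection "${collection}" is not allowlisted for HubSpot sync`);
  }
}
```

Scoping the check to the collection rather than to individual fields means a newly added sensitive field is blocked without anyone having to remember to update a field list.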

8. What we present vs what we hide

| Field | Member can see in portal | Pattern |
| --- | --- | --- |
| Full name | Yes | Always visible |
| Date of birth | Yes | Always visible |
| Sex | Yes | Always visible |
| Home address | Yes | Always visible |
| Phone | Yes | Always visible |
| Email | Yes | Always visible |
| SSN | Masked default; step-up to reveal | ***-**-1234 |
| Immigration doc number | Masked default; step-up to reveal | A123 4567 **** |
| Citizenship status | Yes | Always visible |
| Income detail (per source) | Yes | Always visible |
| Employer name + EIN | Yes | Always visible |
| Bank routing | Masked | ***0123 |
| Bank account | Masked default; step-up to reveal | *****1234 |
| Card | Tokenized; last-4 + brand only | **** 1234 · Visa |

For household members: same rules apply to each member's sensitive fields. The primary applicant can see masked summaries of all household members' fields (since they entered them); step-up reveal is per-member.
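The last-4 patterns in the table can be sketched as server-side masking helpers, so that only the masked string ever reaches the browser on the standard read path. The function names are hypothetical, and the immigration document pattern is left out because its exact format is not pinned down here.

```typescript
// Keeps the last 4 digits and prefixes a fixed number of asterisks.
function maskTail(value: string, stars: number): string {
  const digits = value.replace(/\D/g, "");
  if (digits.length < 4) throw new Error("value too short to mask");
  return "*".repeat(stars) + digits.slice(-4);
}

// SSN pattern from the table: ***-**-1234
function maskSsn(ssn: string): string {
  const digits = ssn.replace(/\D/g, "");
  // The error message never echoes the input, per the logging rules.
  if (digits.length !== 9) throw new Error("expected a 9-digit SSN");
  return `***-**-${digits.slice(5)}`;
}

// Bank routing pattern from the table: ***0123
const maskBankRouting = (routing: string): string => maskTail(routing, 3);

// Bank account pattern from the table: *****1234
const maskBankAccount = (account: string): string => maskTail(account, 5);
```

For example, maskSsn("123-45-1234") yields "***-**-1234" and maskBankRouting("021000123") yields "***0123".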

9. Audit trail

Every read of a sensitive field — even by the member themselves — appends an entry to agent_audit_log:

{
  event: "member_sensitive_field_accessed",
  actor: { type: "member" | "agent" | "system", accountId, sessionId },
  applicationId,
  fieldPath,                    // generic name, never the value
  revealedTo: "member_ui" | "agent_review" | "ffm_submission",
  timestamp
}

An auditor (with the audit_reader role) can reconstruct who saw what, and when. The audit log is append-only per ADR 0002 and tamper-evident.
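The entry shape above could be expressed as a TypeScript type plus a small builder, sketched here under assumptions: the string types for the IDs, the ISO 8601 timestamp format, and the builder name are not specified in this doc. The builder takes a field path and has no parameter for the field's value, which keeps the "never the value" rule structural rather than a matter of discipline.

```typescript
type ActorType = "member" | "agent" | "system";
type RevealTarget = "member_ui" | "agent_review" | "ffm_submission";

interface SensitiveFieldAccessEvent {
  event: "member_sensitive_field_accessed";
  actor: { type: ActorType; accountId: string; sessionId: string };
  applicationId: string;
  fieldPath: string;   // generic name, never the value
  revealedTo: RevealTarget;
  timestamp: string;   // ISO 8601 is an assumption
}

// Builds a frozen audit entry. There is deliberately no argument through
// which the sensitive value itself could be passed in.
function buildAccessEvent(
  actor: SensitiveFieldAccessEvent["actor"],
  applicationId: string,
  fieldPath: string,
  revealedTo: RevealTarget,
  now: Date = new Date(),
): Readonly<SensitiveFieldAccessEvent> {
  const entry: SensitiveFieldAccessEvent = {
    event: "member_sensitive_field_accessed",
    actor,
    applicationId,
    fieldPath,
    revealedTo,
    timestamp: now.toISOString(),
  };
  return Object.freeze(entry);
}
```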

10. Data retention

Per data-retention-policy.md, member-portal data retention windows:

| State | Retention |
| --- | --- |
| Abandoned drafts (no email captured) | 90 days from last update, then hard-delete |
| Abandoned drafts (email captured) | 18 months from last update, then hard-delete |
| Submitted-but-not-yet-active | Through coverage year + 7 years (tax retention) |
| Active member | Through coverage year + 7 years post-termination |
| Florence conversation logs | Same as parent application doc |
| agent_audit_log | 7 years minimum (HIPAA + EDE Year 9 default), 10 years for safety |
| Backups | 90 days rolling; encrypted; subject to dual-control restore |

Deletion is hard-delete (not soft-delete) at TTL boundary. CSFLE-encrypted fields go to ciphertext-shredding at TTL — the DEK for that range is destroyed, rendering the ciphertext unrecoverable even with the master key.

A right-to-be-forgotten request that arrives before the retention TTL fires triggers an immediate ciphertext-shredding path on the affected document. The exception is agent_audit_log: its entries must be retained for compliance, and they carry the personallyIdentifiable: false invariant (only IDs + event types, no names/SSNs/DOBs).
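The two abandoned-draft windows in the retention table reduce to simple date arithmetic. The sketch below assumes UTC calendar math and a hypothetical function name; how month-end overflow should be handled is not specified in this doc, so the code uses plain JavaScript Date rollover.

```typescript
// Computes when an abandoned draft becomes due for hard-delete:
// 18 months after the last update if an email was captured, else 90 days.
function abandonedDraftDeleteAt(lastUpdated: Date, emailCaptured: boolean): Date {
  const due = new Date(lastUpdated.getTime()); // copy; do not mutate the input
  if (emailCaptured) {
    // Date rolls over month-end overflow, e.g. Aug 31 + 18 months lands in
    // early March rather than on a non-existent Feb 31.
    due.setUTCMonth(due.getUTCMonth() + 18);
  } else {
    due.setUTCDate(due.getUTCDate() + 90);
  }
  return due;
}
```

A draft last updated on 2026-01-01 with no email captured becomes due on 2026-04-01; one last updated on 2026-01-15 with an email captured becomes due on 2027-07-15.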

11. Plan-of-record for Phase B section integration

Every Phase B section that collects a sensitive field must:

  1. Tag the field in the per-section Zod schema with .brand<"sensitive">()
  2. Add an entry to SENSITIVE_FIELDS registry in src/lib/portal/sensitive-fields.ts (drives the masking + step-up + logger deny-list)
  3. Write an integration test asserting (a) masked display on read, (b) step-up challenge on edit, (c) audit-log entry on reveal, (d) deny-list scrub in logs
  4. Document the field in a Phase B addendum to this doc
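Step 2 in the list above could be sketched as a single registry from which the logger deny-list and the step-up decision are both derived. The entry shape, the property names, and the helper names are assumptions; src/lib/portal/sensitive-fields.ts may end up looking quite different. The Zod brand from step 1 is omitted to keep the sketch free of imports.

```typescript
// How a field is presented on the standard read path.
type DisplayPolicy = "masked_step_up_to_reveal" | "masked_only";

interface SensitiveFieldSpec {
  path: string;    // property name as stored (assumed naming)
  section: number; // portal section that collects the field
  display: DisplayPolicy;
}

// Hypothetical registry contents, following the sections listed in this doc.
const SENSITIVE_FIELDS: readonly SensitiveFieldSpec[] = [
  { path: "ssn", section: 1, display: "masked_step_up_to_reveal" },
  { path: "immigration_document_number", section: 3, display: "masked_step_up_to_reveal" },
  { path: "bank_routing_number", section: 9, display: "masked_only" },
  { path: "bank_account_number", section: 9, display: "masked_step_up_to_reveal" },
];

// The logger deny-list is derived, so registering a field also scrubs it.
const REGISTRY_DENY_LIST: ReadonlySet<string> = new Set(
  SENSITIVE_FIELDS.map((field) => field.path),
);

// True when revealing the field requires a step-up challenge first.
function requiresStepUp(path: string): boolean {
  const spec = SENSITIVE_FIELDS.find((field) => field.path === path);
  return spec !== undefined && spec.display === "masked_step_up_to_reveal";
}
```

Deriving the deny-list from the registry avoids the two lists drifting apart when a Phase B section adds a field.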

Sections that include sensitive fields (per the canonical 9-section scope):

  • Section 1 (primary applicant identity): SSN
  • Section 2 (household composition): per-member SSN, DOB
  • Section 3 (citizenship / immigration): immigration document numbers
  • Section 9 (payment): bank or card details

Section 4 (income) does NOT collect a directly-sensitive field per this doc's definition (income amount is not a re-identification vector on its own), but it IS subject to the HubSpot-egress block and the first-party-analytics no-PHI rule (OpenPanel: bucketed income only, never raw).

12. Open questions for sign-off

  • [ ] Approve the masked-by-default + step-up-reveal pattern for SSN and document numbers, OR opt in to a different pattern (e.g. "never show, even on edit; always require re-entry")
  • [ ] Approve the 18-month retention for abandoned drafts with captured email — too long? too short?
  • [ ] Approve the dual-control restore designation: Taha + Asad. Need a third on-call backup before Asad is fully onboarded
  • [ ] Approve the payment vendor approach (Stripe / Square / other) — separate vendor decision, but it locks the PCI scope downstream

Sign-off

| Reviewer | Role | Status | Date |
| --- | --- | --- | --- |
| Taha Abbasi | CTO-of-record | Pending | - |
| Asad Khalid | CFO / compliance owner | Pending | - |

Once both sign off, link this doc from the Phase A PR and from the ENG-187 Linear issue.


AskFlorence Internal Documentation. Not for public distribution.
