AskFlorence — System Briefing ​

Source: Claude Code session, generated from repo state, commit history, and in-code docs. Repo: github.com/askflorencehealth/ask-florence · Live: askflorence.health · Version: v0.3.2 · As of 2026-04-17

AskFlorence is an AI-assisted healthtech platform for the ACA individual market. Deterministic application logic does the mechanical heavy lifting (pricing, eligibility, subsidy math, plan filtering), and AI is positioned to do what it is actually best at: guiding users through an opaque, jargon-heavy product, educating them in plain language, and acting like a knowledgeable agent sitting next to them. Today the deterministic layer is production-grade and 100% audit-matched against CMS; the AI-assist layer is being built on top of that foundation rather than instead of it.


1. System Overview

  • Single Next.js 16.2.3 App Router application (React 19.2.4, Turbopack) on Vercel. Manual deploys via vercel --prod — no auto-deploy, so every production push is intentional.
  • Data plane: MongoDB Atlas M10, HIPAA tier. Collections: plans, zip_county, regions, counties, waitlist.
  • External integrations: CMS Marketplace API (eligibility), CMS QRS API (cached ratings), Resend (transactional email), PostHog (analytics server + client).
  • ~27,244 LOC across 105 source files, 150 commits, 10 active development days (2026-04-06 → 2026-04-17), 582 files changed, 51,037 insertions.
  • AI healthtech positioning: the platform is designed so AI augments a deterministic, audit-validated core. Pricing and eligibility are math, not guesses; AI is reserved for the layer where humans struggle — understanding what they're buying and why.

2. Architecture + Components

User-facing flows (shipped)

  • Homepage calculator (/) — zip + household → eligibility → subsidized plan cards with before/after pricing.
  • Plans marketplace (/plans) — full browse experience, server-side filtered, 599-LOC marketplace component. Filters by metal, issuer, plan type, deductible, MOOP, star rating. Shares one data pipeline with the homepage.
  • Plan cards — show subsidized premium, CSR-adjusted deductible/MOOP (individual vs family, household-size-aware), QRS star rating, plan ID, issuer, SBC links.
  • Eligibility — federal states proxy CMS Marketplace API; NY calculated fully internally from DFS data.
  • Updates stream (/updates/[slug]) — engineering and product updates as SSG content. Current entries: audit scorecard, "The Week We Matched CMS Byte for Byte," Infrastructure Is Production, Momentum Day, Brand Team Numbers.
  • Agent landing (/agents) — marketing page for licensed agents and agencies.
  • Agent onboarding waitlist (/agent-onboarding) — 6-field waitlist form (the real portal is Phase 5).

API routes

  • /api/plans — household plan search, CSR tier derivation, APTC-applied pricing.
  • /api/eligibility — CMS proxy for federal states, internal calculation for NY.
  • /api/counties — zip → county resolution from our DB.
  • /api/waitlist — dual-purpose consumer + agent signup; writes Mongo, sends Resend email, emits PostHog events.

Core library

  • fetch-plans.ts — the single pipeline every plan query funnels through. Computes realPrice = max(0, premium − finalAptc) unconditionally.
  • owned-plans.ts (1,006 LOC) — Mongo-doc → CMS-shape transform. searchOwnedPlans() serves any of 31 owned-data states with identical output shape as the CMS API.
  • csr.ts — FPL, Medicaid threshold, CSR tier derivation.
  • utils.ts — normalizePlan, extractCostShare, findSlcspPremium, calculateAptc.
  • constants.ts — STATE_BASED_MARKETPLACES authoritative list.
  • db.ts — lazy Mongo client (prevents build-time connect failures on Vercel).
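
As a minimal sketch of the pricing rule attributed to fetch-plans.ts above, the clamp can be written as a pure function. Only the formula realPrice = max(0, premium − finalAptc) comes from this briefing; the function name computeRealPrice, the input validation, and the rounding to whole cents are assumptions made for illustration.

```typescript
// Hypothetical helper illustrating realPrice = max(0, premium - finalAptc).
// The name and the cent-rounding step are assumptions, not the repo's code.
function computeRealPrice(premium: number, finalAptc: number): number {
  if (!Number.isFinite(premium) || !Number.isFinite(finalAptc)) {
    throw new RangeError("premium and finalAptc must be finite numbers");
  }
  // The subsidy can reduce the price to zero but never below it.
  const net = Math.max(0, premium - finalAptc);
  // Round to whole cents so floating-point residue never reaches a plan card.
  return Math.round(net * 100) / 100;
}

// Illustrative inputs only (not real plan data).
console.log(computeRealPrice(412.35, 300)); // subsidy smaller than the premium
console.log(computeRealPrice(250, 400)); // subsidy larger than the premium
```

Keeping the rule in one pure function is one way to make "applied unconditionally" easy to verify: every caller gets the same clamp, and the function can be unit-tested without a database.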

Content system

  • Custom TypeScript-object-based CMS for /updates (not third-party). Entries live in src/lib/updates/index.ts and are rendered via SSG, with bodies written as inline, Markdown-like HTML. Chosen over a headless CMS because updates are high-signal and low-volume, and need to ship at the same velocity as code.

Stack intentionally stripped down

No ORM, no GraphQL, no tRPC, no Redux, no Prisma. Just Mongo driver + Next route handlers + TypeScript. Reduces audit surface, reduces dependency risk, keeps every line of request-handling code reviewable.


3. What Has Been Built (Concrete)

  • Eligibility engine — federal (CMS proxy) + NY (internal). FPL, Medicaid thresholds, CSR tiers, APTC all derived server-side.
  • Plan search / pricing — 30 federal states + NY served from our own MongoDB. Age-rated federal premiums with per-state age curves; community-rated NY. 31 states total.
  • CSR cost-share logic — per-plan puf.csrVariants["94"|"87"|"73"|"zero"|"limited"]. Plan card shows before/after deductible + MOOP based on household FPL.
  • Household-aware display — PlanCard distinguishes individual vs family MOOP/deductible.
  • QRS star ratings — CMS Quality Rating System (global, clinical, enrollee, efficiency), 98.1% plan coverage, ingested via scripts/db/ingest-qrs-ratings.js with a JSON cache.
  • PUF augment pipeline — scripts/db/ingest-puf-augment.js reads CMS PUF CSVs and writes a rich puf.* sub-document per plan: CSR variants, URLs (SBC, formulary, brochure), SBC scenarios (Having a Baby, Diabetes, Simple Fracture), plan features, all MOOP/deductible variants, full per-benefit cost-sharing table, network and formulary foreign keys.
  • Waitlist system — consumer + agent flows. Agent form captures role toggle (individual / agency), team size, 6-10 digit NPN validation, phone, email. Writes to waitlist, notifies [email protected].
  • Marketplace browse — /plans page with full filter/sort.
  • Brand system — full SVG + PNG logo set, per-route dynamic OG images, favicon with approved lantern mark.
  • Docs site — docs.askflorence.health (VitePress, email-verification gated). Architecture, security/compliance, data sources, glossary.
  • SOC 2 groundwork — account inventory and vulnerability management policy committed; proactive CVE tracking (Next.js 16.2.2 → 16.2.3 bump for CVE-2026-23869).
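
The CSR cost-share item above can be illustrated with a small sketch that maps household income (as a percent of FPL) to one of the income-based variant keys and picks the "after" deductible and MOOP. The keys "94", "87" and "73" mirror the puf.csrVariants keys named above, and the 150 / 200 / 250 percent bands are the standard ACA CSR income bands. Everything else is an assumption: the type and function names are hypothetical, upper bounds are treated as inclusive, and Medicaid routing plus the "zero" and "limited" variants are deliberately left out.

```typescript
// Sketch only: names are hypothetical, not the actual csr.ts API.
type IncomeCsrKey = "94" | "87" | "73";

interface CostShare {
  deductible: number;
  moop: number;
}

// Map income (percent of FPL) to an income-based CSR variant key.
// Assumption: bounds are inclusive at the top of each band.
function csrKeyForFpl(fplPercent: number): IncomeCsrKey | null {
  if (!Number.isFinite(fplPercent) || fplPercent < 100) return null;
  if (fplPercent <= 150) return "94";
  if (fplPercent <= 200) return "87";
  if (fplPercent <= 250) return "73";
  return null;
}

// Return the before/after pair a plan card could display.
// Falls back to the base cost share when no variant applies or none is stored.
function costShareForHousehold(
  base: CostShare,
  variants: Partial<Record<IncomeCsrKey, CostShare>>,
  fplPercent: number,
): { before: CostShare; after: CostShare } {
  const key = csrKeyForFpl(fplPercent);
  const after = key !== null ? (variants[key] ?? base) : base;
  return { before: base, after };
}

// Illustrative numbers only (not real plan data).
const example = costShareForHousehold(
  { deductible: 5000, moop: 9000 },
  { "87": { deductible: 800, moop: 3000 } },
  180,
);
console.log(example.before, example.after);
```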

4. Data Accuracy Harness — The Unusual Part ​

A five-tier audit system under scripts/audit/ validates our database against CMS reality. This is what makes the platform viable for AI augmentation: if the deterministic floor were not accurate, AI layered on top would amplify its errors.

  • Tier 1 (tier-1-zip-county.js) — 18,785 zips, resolves zip → county vs CMS. 100% match, zero mismatches, zero extras.
  • Tier 2 (tier-2-zip-premium.js) — 28,496 zips, cheapest-Silver-per-zip vs CMS. 100% match. The headline audit.
  • Tier 3 (tier-3-scenarios.js) — 10 household scenarios × 199 counties = 1,990 probes. 100%.
  • Tier 4 (tier-4-integrity.js) — internal consistency, no CMS calls. Honors state-specific age curves (e.g. Utah's 1.35–1.45x ratio, legal under 45 CFR 147.102(e)). Zero warnings.
  • Tier 5 (tier-5-plan-ra-sweep.js) — every plan × rating area = 3,357 comparisons. 100%.
  • parity-check.js — 15-check shape and behavior guard (CI-style pre-deploy guard).
  • Shared helpers in scripts/audit/lib/db-helpers.js — authoritative federal vs SBE state sets, separate read/write DB URIs.
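
The comparison at the heart of each tier can be sketched generically: take our values and the reference values keyed the same way, then report matches, mismatches, missing keys, and extras. This is an illustrative sketch, not the code in scripts/audit/ (which is JavaScript and talks to CMS and MongoDB); the names auditAgainstReference and AuditReport and the sample zip-to-county pairs are assumptions.

```typescript
// Hypothetical audit core: compare our data with a reference data set.
interface AuditReport<V> {
  checked: number;
  matched: number;
  mismatches: { key: string; ours: V; reference: V }[];
  missing: string[]; // present in the reference, absent from our data
  extras: string[]; // present in our data, absent from the reference
}

function auditAgainstReference<V>(
  ours: Map<string, V>,
  reference: Map<string, V>,
  equal: (a: V, b: V) => boolean = (a, b) => a === b,
): AuditReport<V> {
  const report: AuditReport<V> = {
    checked: 0,
    matched: 0,
    mismatches: [],
    missing: [],
    extras: [],
  };
  reference.forEach((refValue, key) => {
    report.checked += 1;
    if (!ours.has(key)) {
      report.missing.push(key);
      return;
    }
    const ourValue = ours.get(key) as V;
    if (equal(ourValue, refValue)) report.matched += 1;
    else report.mismatches.push({ key, ours: ourValue, reference: refValue });
  });
  ours.forEach((_value, key) => {
    if (!reference.has(key)) report.extras.push(key);
  });
  return report;
}

// A tier passes only with a full match and no extras.
function auditPasses<V>(r: AuditReport<V>): boolean {
  return r.checked > 0 && r.matched === r.checked && r.extras.length === 0;
}

// Illustrative zip -> county FIPS pairs, not audit output.
const oursSample = new Map([["10001", "36061"], ["84101", "49035"]]);
const cmsSample = new Map([["10001", "36061"], ["84101", "49035"]]);
console.log(auditPasses(auditAgainstReference(oursSample, cmsSample)));
```

Reporting extras separately from mismatches matches the way Tier 1 is described above ("zero mismatches, zero extras"): a row we hold that the reference does not is a different failure from a row that disagrees.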

5. In Progress / Actively Being Built

The distinction between shipped and designed matters here. The compliance and agent-portal plan is designed in detail in project docs, but only the foundations are in code.

Shipped

  • /agents marketing page
  • /agent-onboarding waitlist stub

Designed, not yet implemented

  • Agent auth in two tiers. Tier 1 (onboarding, survey) = email magic link. Tier 2 (post-activation) = magic link + TOTP via authenticator app. Session limits: 15-min idle / 8-hr absolute. No SMS one-time codes, since NIST SP 800-63B classifies SMS as a restricted authenticator.
  • NIPR PDB NPN validation — $1.30 per call at onboarding + monthly alert subscription for actives.
  • ID verification — vendor-adapter pattern (src/lib/agent-identity.ts), vendor TBD (Persona / Stripe Identity / Plaid / Veriff).
  • Super-admin path (/sa-login): argon2id password + TOTP + IP allowlist (Tailscale). Three independent layers (knowledge, possession, network location) guard Taha's admin management.
  • admins collection — DB-backed roles: super_admin / admin / support.
  • agent_audit_log — append-only, 6–10 year retention (HIPAA/EDE safe).
  • Consent sub-document — every email-capturing record versioned with statement text, IP, userAgent, opt-ins.
  • Narrow-scoped Mongo users — app_writer_survey, app_writer_plans, app_writer_agents, app_admin_agents, audit_reader. Today: two users (read + broad-write), broad-write to be deleted pre-production.
  • AWS migration — blocks the whole agent portal because SOC 2 / HIPAA / CMS EDE audits look back at months of operating history, so building PHI flows on current infrastructure would compromise a future audit.
  • /privacy and /terms pages — block the discovery survey.
  • Unsubscribe flow — CAN-SPAM prerequisite.
  • /agent-discovery — 11-screen research survey (Phase 2).
  • Drug formulary UI (Phase C) and provider directory UI (Phase D) — data already ingested (puf.formularyId, puf.networkId), UI not built.
  • Plan detail page (Phase E) — all puf.* data in MongoDB, UI not built.
  • Member portal, admin portal, enrollment (EDE) flow — not yet in scope; today we show plans, we do not enroll.
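
Since the agent session policy above is designed but not yet implemented, the following is only a sketch of how the 15-minute idle and 8-hour absolute limits could be evaluated. The SessionTimes shape, the function name, and the choice to check the absolute limit first are assumptions.

```typescript
// Hypothetical session-limit check; limits come from the design notes above.
const IDLE_LIMIT_MS = 15 * 60 * 1000; // 15 minutes without activity
const ABSOLUTE_LIMIT_MS = 8 * 60 * 60 * 1000; // 8 hours since login

interface SessionTimes {
  createdAt: number; // epoch ms at login
  lastSeenAt: number; // epoch ms of the most recent authenticated request
}

type SessionStatus = "active" | "idle_expired" | "absolute_expired";

function sessionStatus(session: SessionTimes, now: number): SessionStatus {
  // Absolute limit wins: activity can never extend a session past 8 hours.
  if (now - session.createdAt >= ABSOLUTE_LIMIT_MS) return "absolute_expired";
  if (now - session.lastSeenAt >= IDLE_LIMIT_MS) return "idle_expired";
  return "active";
}

// Usage: a session created 20 minutes ago and last used 5 minutes ago.
const nowMs = Date.now();
console.log(
  sessionStatus(
    { createdAt: nowMs - 20 * 60 * 1000, lastSeenAt: nowMs - 5 * 60 * 1000 },
    nowMs,
  ),
);
```

Passing `now` in as a parameter rather than reading the clock inside the function keeps the check deterministic and easy to test.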

6. Data Complexity

  • CMS PUF (Public Use Files) — multi-GB CSVs: plan-attributes-puf.csv, benefits-and-cost-sharing-puf.csv, rate-PUF. Per-plan, per-variant, per-benefit rows. Normalized into puf.* sub-document.
  • CMS Marketplace API — live eligibility checks.
  • CMS QRS API — quality ratings, cached to local JSON.
  • NY DFS — community-rated rates (loaded via load-ny-2026.js).
  • Geo data — zip → county → rating area, loaded and enriched via load-zip-county.js + enrich-zip-counties.js.
  • Challenges solved in scripts/db/: tobacco rates back-filled; stale zip entries marked sbeRedirect or unsupported (not deleted, preserving history); per-zip Alaska rating-area split; 5 county RA mapping corrections; Cigna NC RA15 rates rebuilt from CMS; graceful handling for 16 stale zip_county entries.

7. AI / Agentic Architecture

Philosophy: AI where it adds the most value

ACA pricing and eligibility have mechanical ground truth. CMS publishes age curves, family-aggregation rules, APTC formulas, SLCSP benchmarks. A deterministic implementation can be audited to 100% parity (and has been — see Section 4). A language model layered over that math would be strictly worse: slower, more expensive, non-reproducible, and harder to defend in a CMS audit.

So the platform is split cleanly:

  • Deterministic core does the math — eligibility, subsidies, plan filtering, pricing, cost-share calculations. This is the part the user must be able to trust.
  • AI will do the human-shaped work — guiding, explaining, demystifying, teaching, reassuring. This is the part users actually need help with.

Where AI is planned (the highest-leverage slots)

  • Guided plan picking — conversational interface that asks the user about their life (do you have a regular doctor, do you take any meds, how often do you use the ER) and translates their answers into deterministic filters. The AI doesn't pick the plan; it helps the user articulate what matters to them, then hands the machine-readable filter set to the deterministic engine.
  • Plain-language plan explanations — every field on a plan card (deductible, MOOP, coinsurance, network tier, formulary tier) is jargon. AI rewrites the plan's cost structure in terms of the user's actual situation ("for a doctor visit, you pay $25 until your deductible — which is $3,000 — is hit").
  • Benefit and SBC summarization — puf.benefitDetails[] and puf.sbcScenarios are already ingested per plan. AI will turn these into "what this plan covers for someone like you."
  • Formulary and network lookups — "is my doctor in this plan's network" and "is my medication covered" as natural-language queries against the (Phase C/D) structured data, with the AI as the interface and the structured data as the source of truth.
  • Agent-like experience — the product promise is "as if an agent is there for you." AI makes that feasible at scale without compromising accuracy, because the numbers always come from the audited engine.
  • Agent discovery survey synthesis (Phase 2) — unstructured qualitative responses from licensed agents, summarized for the ops team.
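
The guided plan picking handoff described above (AI produces a machine-readable filter set, the deterministic engine does the rest) implies a validation boundary. The sketch below shows one way that boundary might look: model output is treated as untrusted input and reduced to a whitelisted filter object. The PlanFilters fields are hypothetical, loosely based on the /plans filters listed earlier, and none of this exists in the repo yet.

```typescript
// Hypothetical filter schema; field names are assumptions for illustration.
const METALS = ["Catastrophic", "Bronze", "Silver", "Gold", "Platinum"] as const;
const PLAN_TYPES = ["HMO", "PPO", "EPO", "POS"] as const;
type Metal = (typeof METALS)[number];
type PlanType = (typeof PLAN_TYPES)[number];

interface PlanFilters {
  metals?: Metal[];
  planTypes?: PlanType[];
  maxDeductible?: number;
  minStars?: number;
}

// Keep only string values that appear in the allowed list.
function pickAllowed<T extends string>(value: unknown, allowed: readonly T[]): T[] {
  if (!Array.isArray(value)) return [];
  return value.filter(
    (v): v is T =>
      typeof v === "string" && (allowed as readonly string[]).indexOf(v) !== -1,
  );
}

// Reduce untrusted model output to known filters. Unknown keys are dropped,
// so the model cannot smuggle in a price or eligibility value.
function parseFilters(raw: unknown): PlanFilters {
  const out: PlanFilters = {};
  if (typeof raw !== "object" || raw === null) return out;
  const r = raw as Record<string, unknown>;

  const metals = pickAllowed(r.metals, METALS);
  if (metals.length > 0) out.metals = metals;

  const planTypes = pickAllowed(r.planTypes, PLAN_TYPES);
  if (planTypes.length > 0) out.planTypes = planTypes;

  const d = r.maxDeductible;
  if (typeof d === "number" && Number.isFinite(d) && d >= 0) out.maxDeductible = d;

  const s = r.minStars;
  if (typeof s === "number" && Number.isInteger(s) && s >= 1 && s <= 5) out.minStars = s;

  return out;
}

// Usage: the invalid metal, out-of-range stars, and stray "premium" key are dropped.
console.log(parseFilters({ metals: ["Silver", "Diamond"], minStars: 9, premium: 0 }));
```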

Where AI is intentionally NOT used

  • Pricing. Subsidy math. APTC calculation. SLCSP lookup. CSR tier derivation. Eligibility determination. Plan filtering. Any number a user sees on a plan card.
  • Hallucination risk on these fields is unacceptable — users make five-figure financial decisions based on them, and a CMS EDE audit will reconstruct every calculation.

Cost-optimization posture

  • All structured data (plans, benefits, formularies, networks) is already in MongoDB with rich puf.* sub-documents. AI will query this structured data, not re-derive it from raw PUF or CMS APIs at request time.
  • Plan-level facts (premium, deductible, MOOP, copays, star rating, SBC URLs) are fetched once from our DB. AI sees pre-computed facts, not a prompt stuffed with PUF CSV rows. This caps tokens per interaction.
  • Deterministic routing: the app decides when AI is even needed. A user who types a zip and a household size goes straight through the math path with zero LLM calls. AI fires only when the user asks a natural-language question or opts into the guided flow.
  • Cache-friendly prompt structure is planned so per-plan explanations can be generated once per plan-version and reused across users.
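
Two of the points above, deterministic routing and once-per-plan-version explanations, can be sketched together. This is a planning-stage illustration under stated assumptions: PlanQuery, needsLlm, ExplanationCache, and the notion of a planVersion string are all hypothetical, and the generator is synchronous here only to keep the example short (a real model call would be asynchronous).

```typescript
// Hypothetical request shape for the calculator and guided flow.
interface PlanQuery {
  zip: string;
  householdSize: number;
  question?: string; // free-text question, if the user typed one
  guidedFlow?: boolean; // true when the user opted into the guided flow
}

// The app, not the model, decides whether an LLM call is needed at all.
function needsLlm(query: PlanQuery): boolean {
  const askedSomething = (query.question ?? "").trim().length > 0;
  return askedSomething || query.guidedFlow === true;
}

// Cache explanations by plan and plan version so they are generated once
// and reused across users.
class ExplanationCache {
  private readonly entries = new Map<string, string>();
  private readonly generate: (planId: string, planVersion: string) => string;

  constructor(generate: (planId: string, planVersion: string) => string) {
    this.generate = generate;
  }

  get(planId: string, planVersion: string): string {
    const key = `${planId}@${planVersion}`;
    const hit = this.entries.get(key);
    if (hit !== undefined) return hit;
    const text = this.generate(planId, planVersion);
    this.entries.set(key, text);
    return text;
  }
}

// Usage with a stand-in generator; the plan ID is a placeholder.
let generatorCalls = 0;
const cache = new ExplanationCache((id, version) => {
  generatorCalls += 1;
  return `Explanation for ${id} (${version})`;
});
cache.get("PLAN-A", "2026.1");
cache.get("PLAN-A", "2026.1"); // served from cache
console.log(generatorCalls, needsLlm({ zip: "10001", householdSize: 2 }));
```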

AI in the build process today (AI-assisted workflows)

  • The codebase itself is built with Claude Code as the implementation partner. Single developer + AI produced ~27k LOC in 10 days, including the full audit harness that validates against CMS.
  • Data-pipeline fixes (Utah age curve, SBE filtering, Cigna NC rate rebuild, county RA corrections) were found and authored in tight human + AI iteration loops.
  • Internal workflows (update authoring, brand asset generation, OG image templates, commit hygiene) lean on AI heavily; user-facing runtime does not.

8. Security + Compliance Design

Today

  • MongoDB Atlas M10 HIPAA tier, TLS-only, secrets in Vercel env.
  • Two Mongo users (read + broad-write). Broad-write marked for deletion before production launch.
  • No PHI stored today. Waitlist captures PII (email, name, phone, NPN), not PHI.
  • SOC 2 account inventory and vulnerability management policy checked in.
  • Proactive CVE tracking (Next.js 16.2.2 → 16.2.3 for CVE-2026-23869).
  • Per-route OG metadata overrides to prevent default-image leakage.

Designed for the agent platform

  • SOC 2 / HIPAA / CMS EDE audit-readiness from day one on AWS. EDE audits look back at months, not just audit day.
  • Append-only agent_audit_log — 6–10 year retention.
  • Tier 1 / Tier 2 session policy — 15-min idle / 8-hr absolute.
  • Vendor BAA matrix — MongoDB Atlas, Resend, AWS, NIPR, ID verify vendor, Cloudflare, PostHog.
  • Consent sub-document versioning on every email-capturing record, so future CRM imports are GDPR-compliant by construction.
  • Duplicate NPN handling — magic link goes to the original email on file, never the submitted email. Rate-limited, audit-logged.
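
The duplicate NPN rule above can be expressed as a small recipient-selection function. The 6 to 10 digit format check comes from the waitlist form described earlier; the record shape, the function names, and the behavior for an NPN not yet on file (send to the submitted address) are assumptions. Rate limiting and audit logging are omitted.

```typescript
// Format check only: 6 to 10 digits. Not a NIPR lookup.
function isValidNpnFormat(npn: string): boolean {
  return /^\d{6,10}$/.test(npn);
}

// Hypothetical stored record for an agent already on file.
interface AgentRecord {
  npn: string;
  email: string;
}

// Decide where a magic link goes. If the NPN already exists, the link goes
// to the email on file, never to the newly submitted one.
function magicLinkRecipient(
  submitted: { npn: string; email: string },
  onFile: Map<string, AgentRecord>,
): string | null {
  if (!isValidNpnFormat(submitted.npn)) return null;
  const existing = onFile.get(submitted.npn);
  return existing ? existing.email : submitted.email;
}

// Usage with placeholder data.
const knownAgents = new Map<string, AgentRecord>([
  ["1234567", { npn: "1234567", email: "original@example.com" }],
]);
console.log(
  magicLinkRecipient({ npn: "1234567", email: "someone-else@example.com" }, knownAgents),
);
```

Routing the link to the address on file means someone who learns an agent's NPN cannot take over the account by submitting a different email.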

PCI

Out of scope today. We do not take payment — carriers pay us PMPM out-of-band. No card handling code in the repo.


9. Build Scope + Speed

  • Start: 2026-04-06 (initial commit).
  • Today: 2026-04-17.
  • 12 calendar days, 11 active dev days.
  • 150 commits, 582 file-changes, 51,037 insertions, 5,446 deletions.
  • ~27,244 LOC across 105 TS/TSX/JS source files, plus ~1k LOC of in-repo docs.
  • Single developer (Taha), working with Claude Code as the implementation partner.
  • Visible codebase progression in the commit log: Day 1 = CMS API demo → Day 3 = MongoDB Phase 1 → Day 7 = serve all 31 states from own DB → Day 9 = 100% audit match → Day 10 = agent platform Phase 1 shipped.

10. Unique / Non-Obvious Decisions

  • Deterministic floor, AI ceiling. The unusual call: build the audit-validated math engine first, then layer AI on top only where it adds irreplaceable value. Most "AI-first" healthtech goes the other way and ends up defending accuracy problems forever.
  • Shape parity with CMS. mongoDocToCmsPlan transforms our Mongo docs into the exact CMS API Plan shape. The frontend can't tell where a plan came from. That is what made the 30-states-from-own-DB migration invisible to the UI, and it's what will make AI features portable across data sources.
  • One pipeline for all plan queries. Homepage calculator and /plans marketplace both flow through fetch-plans.ts. Prevents behavior drift between surfaces, and means any AI-assisted surface gets the same pricing the marketplace shows.
  • Lazy Mongo client. Prevents build-time connection failures on Vercel's static generation — a subtle Next.js + serverless gotcha.
  • State-specific age curves honored. Utah's 1.35–1.45x ratio (legal under 45 CFR 147.102(e)) is whitelisted in the audit. Generic assumptions would have produced 196 false positives.
  • SBE-aware audit filtering. 14 state-based marketplaces don't publish through the federal API; the audit excludes them rather than logging false mismatches.
  • APTC applied at the plan-card layer, not just eligibility. realPrice = max(0, premium − finalAptc) in fetch-plans.ts for every household. Fixed a bug where non-Medicaid subsidized users saw unsubsidized sticker prices.
  • Custom TS-object update CMS. Not a headless CMS. Updates ship at the same velocity as code, version-controlled alongside it. Chosen because the update cadence is high-signal and low-volume.
  • Stack stripped down on purpose. No ORM, no GraphQL, no tRPC, no Redux, no Prisma. Smaller audit surface. Every line of request-handling code is reviewable in a sitting.
  • Data-engineering scripts are first-class. 8.7k LOC of ingest + audit scripts live in the repo and never run in production. Treated as part of the product because data quality is the product.
  • Compliance-aware deferral. The agent portal is deliberately not on current infrastructure. Operating PHI flows on a non-SOC-2 stack now would compromise a future audit, so the team is waiting for AWS migration rather than shipping short-term.
  • No em dashes. Style rule enforced across all user-facing content.
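
The lazy Mongo client decision in the list above follows a general pattern: defer construction until first use so that importing the module (for example during static generation at build time) never opens a connection. The sketch below shows the pattern with a generic helper and a stand-in factory. It is not the contents of db.ts; the helper name lazy and the environment variable in the comment are assumptions.

```typescript
// Generic lazy initializer: the factory runs on first call, then the same
// instance is returned on every later call.
function lazy<T>(factory: () => T): () => T {
  let instance: T | undefined;
  let created = false;
  return () => {
    if (!created) {
      instance = factory();
      created = true;
    }
    return instance as T;
  };
}

// With the official driver this might look like (not executed here; the
// env var name is hypothetical):
//   const getClient = lazy(() => new MongoClient(process.env.MONGODB_URI!));

// Stand-in factory so the example runs without a database.
let connectionsOpened = 0;
const getFakeClient = lazy(() => {
  connectionsOpened += 1;
  return { id: connectionsOpened };
});

console.log(connectionsOpened); // nothing constructed at import time
getFakeClient();
getFakeClient();
console.log(connectionsOpened); // constructed once, on first use
```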

Generated from a Claude Code session with direct access to the repo, commit history, and in-code documentation.


AskFlorence Internal Documentation. Not for public distribution.
