AskFlorence — System Briefing

Source: Claude Code session, generated from repo state, commit history, and in-code docs. Repo: github.com/askflorencehealth/ask-florence · Live: askflorence.health · Version: v0.3.2 · As of 2026-04-17

AskFlorence is an AI-assisted healthtech platform for the ACA individual market. Deterministic application logic does the mechanical heavy lifting (pricing, eligibility, subsidy math, plan filtering), and AI is positioned to do what it is actually best at: guiding users through an opaque, jargon-heavy product, educating them in plain language, and acting like a knowledgeable agent sitting next to them. Today the deterministic layer is production-grade and 100% audit-matched against CMS; the AI-assist layer is being built on top of that foundation rather than instead of it.

1. System Overview

Single Next.js 16.2.3 App Router application (React 19.2.4, Turbopack) on Vercel. Manual deploys via vercel --prod — no auto-deploy, so every production push is intentional.
Data plane: MongoDB Atlas M10, HIPAA tier. Collections: plans, zip_county, regions, counties, waitlist.
External integrations: CMS Marketplace API (eligibility), CMS QRS API (cached ratings), Resend (transactional email), PostHog (analytics server + client).
~27,244 LOC across 105 source files, 150 commits, 10 active development days (2026-04-06 → 2026-04-17), 582 files changed, 51,037 insertions.
AI healthtech positioning: the platform is designed so AI augments a deterministic, audit-validated core. Pricing and eligibility are math, not guesses; AI is reserved for the layer where humans struggle — understanding what they're buying and why.

2. Architecture + Components

User-facing flows (shipped)

Homepage calculator (/) — zip + household → eligibility → subsidized plan cards with before/after pricing.
Plans marketplace (/plans) — full browse experience, server-side filtered, 599-LOC marketplace component. Filters by metal, issuer, plan type, deductible, MOOP, star rating. Shares one data pipeline with the homepage.
Plan cards — show subsidized premium, CSR-adjusted deductible/MOOP (individual vs family, household-size-aware), QRS star rating, plan ID, issuer, SBC links.
Eligibility — federal states proxy CMS Marketplace API; NY calculated fully internally from DFS data.
Updates stream (/updates/[slug]) — engineering and product updates as SSG content. Current entries: audit scorecard, "The Week We Matched CMS Byte for Byte," Infrastructure Is Production, Momentum Day, Brand Team Numbers.
Agent landing (/agents) — marketing page for licensed agents and agencies.
Agent onboarding waitlist (/agent-onboarding) — 6-field waitlist form (the real portal is Phase 5).

API routes

/api/plans — household plan search, CSR tier derivation, APTC-applied pricing.
/api/eligibility — CMS proxy for federal states, internal calculation for NY.
/api/counties — zip → county resolution from our DB.
/api/waitlist — dual-purpose consumer + agent signup; writes Mongo, sends Resend email, emits PostHog events.
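
The split inside /api/eligibility (CMS proxy for federal states, internal calculation for NY) might look like the sketch below. The function name eligibilitySource, the EligibilityPath type, and the three-state sample set are illustrative assumptions, not code from the repo. The third branch for other state-based marketplaces is also an assumption, inferred from the sbeRedirect marker described in Section 6.

```typescript
// Hypothetical sketch of the routing decision in /api/eligibility.
type EligibilityPath = "cms-proxy" | "internal-ny" | "sbe-redirect";

// Assumption: abbreviated sample only. The authoritative list is said to
// live in constants.ts as STATE_BASED_MARKETPLACES.
const SAMPLE_STATE_BASED_MARKETPLACES: ReadonlySet<string> = new Set(["CA", "NY", "WA"]);

function eligibilitySource(stateCode: string): EligibilityPath {
  const state = stateCode.trim().toUpperCase();
  // NY is checked first because it is both state-based and served internally.
  if (state === "NY") return "internal-ny";
  if (SAMPLE_STATE_BASED_MARKETPLACES.has(state)) return "sbe-redirect";
  // Everything else is treated as a federal-marketplace state.
  return "cms-proxy";
}
```

Keeping the decision in one pure function means the route handler only has to dispatch on the returned value.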

Core library

fetch-plans.ts — the single pipeline every plan query funnels through. Computes realPrice = max(0, premium − finalAptc) unconditionally.
owned-plans.ts (1,006 LOC) — Mongo-doc → CMS-shape transform. searchOwnedPlans() serves any of 31 owned-data states with identical output shape as the CMS API.
csr.ts — FPL, Medicaid threshold, CSR tier derivation.
utils.ts — normalizePlan, extractCostShare, findSlcspPremium, calculateAptc.
constants.ts — STATE_BASED_MARKETPLACES authoritative list.
db.ts — lazy Mongo client (prevents build-time connect failures on Vercel).
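
The pricing rule in fetch-plans.ts can be illustrated with a minimal sketch. Only the realPrice formula comes from this briefing. The function signatures, the calculateAptcSketch name, and the dollar amounts are assumptions for illustration, and the credit is modeled in its simplest form (benchmark premium minus expected contribution).

```typescript
// Hypothetical signatures; only the realPrice formula is taken from the briefing.

/** Monthly APTC: benchmark (SLCSP) premium minus expected contribution, floored at 0. */
function calculateAptcSketch(slcspPremium: number, expectedMonthlyContribution: number): number {
  return Math.max(0, slcspPremium - expectedMonthlyContribution);
}

/** realPrice = max(0, premium - finalAptc): the credit never pushes a price below zero. */
function realPrice(premium: number, finalAptc: number): number {
  return Math.max(0, premium - finalAptc);
}

// Illustrative numbers only.
const aptc = calculateAptcSketch(480, 130); // 350
const silver = realPrice(480, aptc);        // 130
const bronze = realPrice(310, aptc);        // 0, because the credit exceeds the premium
```

Applying the floor inside one shared function is what lets every surface show the same subsidized number.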

Content system

Custom TypeScript-object-based CMS for /updates (not third-party). Entries live in src/lib/updates/index.ts, rendered via SSG. Markdown-like HTML inline. Chosen over a headless CMS because update cadence is high-signal, low-volume, and needs to ship at the same velocity as code.
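
A TypeScript-object CMS of this kind could be as small as the sketch below. The UpdateEntry shape, the slugs, and the helper names are assumptions for illustration; the real contents of src/lib/updates/index.ts are not shown in this briefing.

```typescript
// Hypothetical entry shape for a TypeScript-object CMS.
interface UpdateEntry {
  slug: string;
  title: string;
  bodyHtml: string; // inline HTML, authored directly in the object
}

// Titles come from the briefing; slugs and bodies are placeholders.
const updates: readonly UpdateEntry[] = [
  { slug: "infrastructure-is-production", title: "Infrastructure Is Production", bodyHtml: "<p>...</p>" },
  { slug: "momentum-day", title: "Momentum Day", bodyHtml: "<p>...</p>" },
];

function getUpdateBySlug(slug: string): UpdateEntry | undefined {
  return updates.find((entry) => entry.slug === slug);
}

// In the App Router, a function like this would back generateStaticParams
// so that every /updates/[slug] page is rendered at build time.
function updateStaticParams(): { slug: string }[] {
  return updates.map((entry) => ({ slug: entry.slug }));
}
```

Because entries are plain typed objects, a malformed update fails the type check at build time instead of at render time.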

Stack intentionally stripped down

No ORM, no GraphQL, no tRPC, no Redux, no Prisma. Just Mongo driver + Next route handlers + TypeScript. Reduces audit surface, reduces dependency risk, keeps every line of request-handling code reviewable.

3. What Has Been Built (Concrete)

Eligibility engine — federal (CMS proxy) + NY (internal). FPL, Medicaid thresholds, CSR tiers, APTC all derived server-side.
Plan search / pricing — 30 federal states + NY served from our own MongoDB. Age-rated federal premiums with per-state age curves; community-rated NY. 31 states total.
CSR cost-share logic — per-plan puf.csrVariants["94"|"87"|"73"|"zero"|"limited"]. Plan card shows before/after deductible + MOOP based on household FPL.
Household-aware display — PlanCard distinguishes individual vs family MOOP/deductible.
QRS star ratings — CMS Quality Rating System (global, clinical, enrollee, efficiency), 98.1% plan coverage, ingested via scripts/db/ingest-qrs-ratings.js with a JSON cache.
PUF augment pipeline — scripts/db/ingest-puf-augment.js reads CMS PUF CSVs and writes a rich puf.* sub-document per plan: CSR variants, URLs (SBC, formulary, brochure), SBC scenarios (Having a Baby, Diabetes, Simple Fracture), plan features, all MOOP/deductible variants, full per-benefit cost-sharing table, network and formulary foreign keys.
Waitlist system — consumer + agent flows. Agent form captures role toggle (individual / agency), team size, 6-10 digit NPN validation, phone, email. Writes to waitlist, notifies [email protected].
Marketplace browse — /plans page with full filter/sort.
Brand system — full SVG + PNG logo set, per-route dynamic OG images, favicon with approved lantern mark.
Docs site — docs.askflorence.health (VitePress, email-verification gated). Architecture, security/compliance, data sources, glossary.
SOC 2 groundwork — account inventory and vulnerability management policy committed; proactive CVE tracking (Next.js 16.2.2 → 16.2.3 bump for CVE-2026-23869).
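
The FPL and CSR tier derivation listed above can be sketched as follows. The tier keys mirror puf.csrVariants. The function names, the lower bound parameter, and the poverty guideline constants are assumptions: the constants are the 2025 HHS figures for the contiguous states as best recalled, and should be checked against the published table.

```typescript
// Income-based tiers only; "zero" and "limited" are not driven by income band alone.
type IncomeCsrTier = "94" | "87" | "73" | null;

// Assumption: 2025 HHS poverty guidelines, 48 contiguous states and DC.
// Verify against the published table before relying on these values.
const FPL_BASE_USD = 15650;
const FPL_PER_ADDITIONAL_PERSON_USD = 5500;

function fplPercent(annualIncome: number, householdSize: number): number {
  const guideline = FPL_BASE_USD + FPL_PER_ADDITIONAL_PERSON_USD * (householdSize - 1);
  return (annualIncome / guideline) * 100;
}

// lowerBoundPct is a simplification: below it the household is assumed to be
// outside CSR (for example, routed to Medicaid). The real threshold varies by state.
function csrTierForFpl(pct: number, lowerBoundPct: number = 100): IncomeCsrTier {
  if (pct < lowerBoundPct) return null;
  if (pct <= 150) return "94";
  if (pct <= 200) return "87";
  if (pct <= 250) return "73";
  return null; // above 250% FPL there is no income-based CSR variant
}
```

The returned key can index straight into a plan's csrVariants to pick the deductible and MOOP shown on the card.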

4. Data Accuracy Harness — The Unusual Part

A five-tier audit system under scripts/audit/ validates our database against CMS reality. This is what makes the platform viable for AI augmentation: if the deterministic floor weren't perfect, AI layered on top would amplify its errors.

Tier 1 (tier-1-zip-county.js) — 18,785 zips, resolves zip → county vs CMS. 100% match, zero mismatches, zero extras.
Tier 2 (tier-2-zip-premium.js) — 28,496 zips, cheapest-Silver-per-zip vs CMS. 100% match. The headline audit.
Tier 3 (tier-3-scenarios.js) — 10 household scenarios × 199 counties = 1,990 probes. 100%.
Tier 4 (tier-4-integrity.js) — internal consistency, no CMS calls. Honors state-specific age curves (e.g. Utah's 1.35–1.45x ratio, legal under 45 CFR 147.102(e)). Zero warnings.
Tier 5 (tier-5-plan-ra-sweep.js) — every plan × rating area = 3,357 comparisons. 100%.
parity-check.js — 15-check shape and behavior guard (CI-style pre-deploy guard).
Shared helpers in scripts/audit/lib/db-helpers.js — authoritative federal vs SBE state sets, separate read/write DB URIs.
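
The core of a tier like Tier 1 is a keyed comparison that reports mismatches and extras separately. The sketch below shows that idea in TypeScript for consistency with the app code; the actual audit scripts are JavaScript and are not reproduced here. The AuditReport shape and function names are assumptions.

```typescript
interface AuditReport {
  checked: number;
  mismatches: string[]; // keys on both sides with different values
  missing: string[];    // keys the reference has that we lack
  extras: string[];     // keys we have that the reference lacks
}

function compareAgainstReference(
  ours: ReadonlyMap<string, string>,
  reference: ReadonlyMap<string, string>,
): AuditReport {
  const report: AuditReport = { checked: 0, mismatches: [], missing: [], extras: [] };
  reference.forEach((expected, key) => {
    report.checked += 1;
    const actual = ours.get(key);
    if (actual === undefined) report.missing.push(key);
    else if (actual !== expected) report.mismatches.push(key);
  });
  ours.forEach((_value, key) => {
    if (!reference.has(key)) report.extras.push(key);
  });
  return report;
}

// "100% match, zero mismatches, zero extras" corresponds to this predicate.
function isFullMatch(report: AuditReport): boolean {
  return report.mismatches.length === 0 && report.missing.length === 0 && report.extras.length === 0;
}
```

Tracking extras separately from mismatches matters because a stale row in our database would never surface if the loop only walked the reference side.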

5. In Progress / Actively Being Built

The distinction between shipped and designed matters here. The compliance and agent-portal plan is specified in detail in project docs, but only the foundations are in code.

Shipped

/agents marketing page
/agent-onboarding waitlist stub

Designed, not yet implemented

Agent auth — two tiers. Tier 1 (onboarding, survey) = email magic link. Tier 2 (post-activation) = magic link + TOTP via authenticator app. 15-min idle / 8-hr absolute session limits. No SMS (NIST 800-63B compliant).
NIPR PDB NPN validation — $1.30 per call at onboarding + monthly alert subscription for actives.
ID verification — vendor-adapter pattern (src/lib/agent-identity.ts), vendor TBD (Persona / Stripe Identity / Plaid / Veriff).
Super-admin path (/sa-login) — argon2id password + TOTP + IP allowlist (Tailscale). Three factor classes for Taha's admin management.
admins collection — DB-backed roles: super_admin / admin / support.
agent_audit_log — append-only, 6–10 year retention (HIPAA/EDE safe).
Consent sub-document — every email-capturing record versioned with statement text, IP, userAgent, opt-ins.
Narrow-scoped Mongo users — app_writer_survey, app_writer_plans, app_writer_agents, app_admin_agents, audit_reader. Today: two users (read + broad-write), broad-write to be deleted pre-production.
AWS migration — blocks the whole agent portal because SOC 2 / HIPAA / CMS EDE audits look back at months of operating history, so building PHI flows on current infrastructure would compromise a future audit.
/privacy and /terms pages — block the discovery survey.
Unsubscribe flow — CAN-SPAM prerequisite.
/agent-discovery — 11-screen research survey (Phase 2).
Drug formulary UI (Phase C) and provider directory UI (Phase D) — data already ingested (puf.formularyId, puf.networkId), UI not built.
Plan detail page (Phase E) — all puf.* data in MongoDB, UI not built.
Member portal, admin portal, enrollment (EDE) flow — not yet in scope; today we show plans, we do not enroll.
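
The 15-minute idle and 8-hour absolute session limits in the agent auth design reduce to a small check. This is a sketch of the policy only; the AgentSession shape and the checkSession name are assumptions, since none of this is implemented yet.

```typescript
const IDLE_LIMIT_MS = 15 * 60 * 1000;         // 15-minute idle limit
const ABSOLUTE_LIMIT_MS = 8 * 60 * 60 * 1000; // 8-hour absolute limit

// Hypothetical session record; timestamps are epoch milliseconds.
interface AgentSession {
  createdAtMs: number;
  lastActivityAtMs: number;
}

type SessionCheck = "ok" | "idle-expired" | "absolute-expired";

function checkSession(session: AgentSession, nowMs: number): SessionCheck {
  // The absolute limit is checked first: activity can never extend it.
  if (nowMs - session.createdAtMs >= ABSOLUTE_LIMIT_MS) return "absolute-expired";
  if (nowMs - session.lastActivityAtMs >= IDLE_LIMIT_MS) return "idle-expired";
  return "ok";
}
```

Passing the current time in as a parameter keeps the check deterministic and easy to test.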

6. Data Complexity

CMS PUF (Public Use Files) — multi-GB CSVs: plan-attributes-puf.csv, benefits-and-cost-sharing-puf.csv, rate-PUF. Per-plan, per-variant, per-benefit rows. Normalized into puf.* sub-document.
CMS Marketplace API — live eligibility checks.
CMS QRS API — quality ratings, cached to local JSON.
NY DFS — community-rated rates (loaded via load-ny-2026.js).
Geo data — zip → county → rating area, loaded and enriched via load-zip-county.js + enrich-zip-counties.js.
Challenges solved in scripts/db/: tobacco rates back-filled; stale zip entries marked sbeRedirect or unsupported (not deleted, preserving history); per-zip Alaska rating-area split; 5 county RA mapping corrections; Cigna NC RA15 rates rebuilt from CMS; graceful handling for 16 stale zip_county entries.
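
Marking stale zips rather than deleting them implies that lookups must branch on a status. A minimal sketch of that resolution step follows. The row shape, the zipStatus field name, and the ZipResolution type are assumptions; only the sbeRedirect and unsupported markers come from this briefing.

```typescript
type ZipStatus = "active" | "sbeRedirect" | "unsupported";

// Hypothetical zip_county row. One zip can span several counties.
interface ZipCountyRow {
  zip: string;
  countyFips: string;
  ratingArea: number;
  zipStatus: ZipStatus;
}

type ZipResolution =
  | { kind: "counties"; rows: ZipCountyRow[] }
  | { kind: "sbeRedirect" }
  | { kind: "unsupported" }
  | { kind: "unknown" };

function resolveZip(rows: readonly ZipCountyRow[], zip: string): ZipResolution {
  const hits = rows.filter((row) => row.zip === zip);
  if (hits.length === 0) return { kind: "unknown" };
  const active = hits.filter((row) => row.zipStatus === "active");
  if (active.length > 0) return { kind: "counties", rows: active };
  // No active rows remain: report why instead of failing silently.
  return hits.some((row) => row.zipStatus === "sbeRedirect")
    ? { kind: "sbeRedirect" }
    : { kind: "unsupported" };
}
```

Returning a tagged result lets the UI distinguish "we do not know this zip" from "this zip belongs to a state-based marketplace".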

7. AI / Agentic Architecture

Philosophy: AI where it adds the most value

ACA pricing and eligibility have mechanical ground truth. CMS publishes age curves, family-aggregation rules, APTC formulas, SLCSP benchmarks. A deterministic implementation can be audited to 100% parity (and has been — see Section 4). A language model layered over that math would be strictly worse: slower, more expensive, non-reproducible, and harder to defend in a CMS audit.

So the platform is split cleanly:

Deterministic core does the math — eligibility, subsidies, plan filtering, pricing, cost-share calculations. This is the part the user must be able to trust.
AI will do the human-shaped work — guiding, explaining, demystifying, teaching, reassuring. This is the part users actually need help with.

Where AI is planned (the highest-leverage slots)

Guided plan picking — conversational interface that asks the user about their life (do you have a regular doctor, do you take any meds, how often do you use the ER) and translates their answers into deterministic filters. The AI doesn't pick the plan; it helps the user articulate what matters to them, then hands the machine-readable filter set to the deterministic engine.
Plain-language plan explanations — every field on a plan card (deductible, MOOP, coinsurance, network tier, formulary tier) is jargon. AI rewrites the plan's cost structure in terms of the user's actual situation ("for a doctor visit, you pay $25 until your deductible — which is $3,000 — is hit").
Benefit and SBC summarization — puf.benefitDetails[] and puf.sbcScenarios are already ingested per plan. AI will turn these into "what this plan covers for someone like you."
Formulary and network lookups — "is my doctor in this plan's network" and "is my medication covered" as natural-language queries against the (Phase C/D) structured data, with the AI as the interface and the structured data as the source of truth.
Agent-like experience — the product promise is "as if an agent is there for you." AI makes that feasible at scale without compromising accuracy, because the numbers always come from the audited engine.
Agent discovery survey synthesis (Phase 2) — unstructured qualitative responses from licensed agents, summarized for the ops team.
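
For guided plan picking, the handoff from AI to the deterministic engine is the critical seam: model output should be treated as untrusted input and reduced to a validated filter set. The sketch below illustrates one way to do that. All type and function names are assumptions, the filter fields are a small subset of what /plans supports, and no model call is shown.

```typescript
type Metal = "bronze" | "silver" | "gold" | "platinum";
const METALS: readonly Metal[] = ["bronze", "silver", "gold", "platinum"];

interface PlanFilterSet {
  metals?: Metal[];
  maxDeductible?: number;
  minStarRating?: number;
}

interface PlanSummary {
  planId: string;
  metal: Metal;
  deductible: number;
  starRating: number | null;
}

// Keep only known fields with sane values; silently drop everything else.
function parseFilterSet(raw: unknown): PlanFilterSet {
  const out: PlanFilterSet = {};
  if (typeof raw !== "object" || raw === null) return out;
  const fields = raw as Record<string, unknown>;

  const metals = fields["metals"];
  if (Array.isArray(metals)) {
    const known = METALS.filter((metal) => metals.includes(metal));
    if (known.length > 0) out.metals = known;
  }
  const maxDeductible = fields["maxDeductible"];
  if (typeof maxDeductible === "number" && Number.isFinite(maxDeductible) && maxDeductible >= 0) {
    out.maxDeductible = maxDeductible;
  }
  const minStarRating = fields["minStarRating"];
  if (typeof minStarRating === "number" && minStarRating >= 1 && minStarRating <= 5) {
    out.minStarRating = minStarRating;
  }
  return out;
}

// Deterministic side: the same filter set always yields the same plans.
function applyFilters(plans: readonly PlanSummary[], filters: PlanFilterSet): PlanSummary[] {
  return plans.filter((plan) =>
    (filters.metals === undefined || filters.metals.includes(plan.metal)) &&
    (filters.maxDeductible === undefined || plan.deductible <= filters.maxDeductible) &&
    (filters.minStarRating === undefined ||
      (plan.starRating !== null && plan.starRating >= filters.minStarRating)));
}
```

Nothing the model emits can change a price here; at worst it produces an empty or overly narrow filter set.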

Where AI is intentionally NOT used

Pricing. Subsidy math. APTC calculation. SLCSP lookup. CSR tier derivation. Eligibility determination. Plan filtering. Any number a user sees on a plan card.
Hallucination risk on these fields is unacceptable — users make five-figure financial decisions based on them, and a CMS EDE audit will reconstruct every calculation.

Cost-optimization posture

All structured data (plans, benefits, formularies, networks) is already in MongoDB with rich puf.* sub-documents. AI will query this structured data, not re-derive it from raw PUF or CMS APIs at request time.
Plan-level facts (premium, deductible, MOOP, copays, star rating, SBC URLs) are fetched once from our DB. AI sees pre-computed facts, not a prompt stuffed with PUF CSV rows. This caps tokens per interaction.
Deterministic routing: the app decides when AI is even needed. A user who types a zip and a household size goes straight through the math path with zero LLM calls. AI fires only when the user asks a natural-language question or opts into the guided flow.
Cache-friendly prompt structure is planned so per-plan explanations can be generated once per plan-version and reused across users.
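
Two of the points above, deterministic routing and per-plan-version caching, can be sketched together. This is a planning-stage illustration: the request shape, the function names, and the synchronous generator are all assumptions (a real model call would be asynchronous), and no actual LLM client is involved.

```typescript
// Hypothetical request kinds; a plain quote never reaches the model.
type UserRequest =
  | { kind: "quote"; zip: string; householdSize: number }
  | { kind: "question"; text: string }
  | { kind: "guided" };

function needsAi(request: UserRequest): boolean {
  return request.kind !== "quote";
}

type Explainer = (planId: string, planVersion: string) => string;

// Wraps a generator so each (planId, planVersion) pair is generated once
// and then reused for every later caller.
function makeCachedExplainer(generate: Explainer): { explain: Explainer; calls: () => number } {
  const store = new Map<string, string>();
  let generatorCalls = 0;
  return {
    explain(planId, planVersion) {
      const key = `${planId}@${planVersion}`;
      const cached = store.get(key);
      if (cached !== undefined) return cached;
      generatorCalls += 1;
      const text = generate(planId, planVersion);
      store.set(key, text);
      return text;
    },
    calls: () => generatorCalls,
  };
}
```

Keying on the plan version means a data refresh invalidates explanations automatically, without a separate purge step.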

AI in the build process today (AI-assisted workflows)

The codebase itself is built with Claude Code as the implementation partner. Single developer + AI produced ~27k LOC in 10 days, including the full audit harness that validates against CMS.
Data-pipeline fixes (Utah age curve, SBE filtering, Cigna NC rate rebuild, county RA corrections) were found and authored in tight human + AI iteration loops.
Internal workflows (update authoring, brand asset generation, OG image templates, commit hygiene) lean on AI heavily; user-facing runtime does not.

8. Security + Compliance Design

Today

MongoDB Atlas M10 HIPAA tier, TLS-only, secrets in Vercel env.
Two Mongo users (read + broad-write). Broad-write marked for deletion before production launch.
No PHI stored today. Waitlist captures PII (email, name, phone, NPN), not PHI.
SOC 2 account inventory and vulnerability management policy checked in.
Proactive CVE tracking (Next.js 16.2.2 → 16.2.3 for CVE-2026-23869).
Per-route OG metadata overrides to prevent default-image leakage.

Designed for the agent platform

SOC 2 / HIPAA / CMS EDE audit-readiness from day one on AWS. EDE audits look back over months of operating history, not just the state of the system on audit day.
Append-only agent_audit_log — 6–10 year retention.
Tier 1 / Tier 2 session policy — 15-min idle / 8-hr absolute.
Vendor BAA matrix — MongoDB Atlas, Resend, AWS, NIPR, ID verify vendor, Cloudflare, PostHog.
Consent sub-document versioning on every email-capturing record, so future CRM imports are GDPR-compliant by construction.
Duplicate NPN handling — magic link goes to the original email on file, never the submitted email. Rate-limited, audit-logged.
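
The duplicate NPN rule above (the link goes to the email on file, never the submitted one) is easy to get wrong, so a sketch of the decision helps. The record shape and function name are assumptions, and rate limiting and audit logging are deliberately left out of this example.

```typescript
// Hypothetical record shape; example.com addresses are placeholders.
interface AgentRecord {
  npn: string;
  emailOnFile: string;
}

const NPN_PATTERN = /^\d{6,10}$/; // 6 to 10 digits, matching the waitlist validation

type MagicLinkDecision =
  | { action: "reject"; reason: "invalid-npn" }
  | { action: "send"; to: string; isExistingAgent: boolean };

function magicLinkTarget(
  existing: ReadonlyMap<string, AgentRecord>,
  npn: string,
  submittedEmail: string,
): MagicLinkDecision {
  if (!NPN_PATTERN.test(npn)) return { action: "reject", reason: "invalid-npn" };
  const onFile = existing.get(npn);
  if (onFile !== undefined) {
    // Known NPN: ignore the submitted address so a third party cannot
    // redirect the login link to an inbox they control.
    return { action: "send", to: onFile.emailOnFile, isExistingAgent: true };
  }
  return { action: "send", to: submittedEmail, isExistingAgent: false };
}
```

The caller would be expected to show the same confirmation message in both "send" cases, so the response does not reveal whether an NPN is already registered.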

PCI

Out of scope today. We do not take payment; carriers pay us per member per month (PMPM) out-of-band. No card handling code in the repo.

9. Build Scope + Speed

Start: 2026-04-06 (initial commit).
Today: 2026-04-17.
12 calendar days, 10 active dev days.
150 commits, 582 files changed, 51,037 insertions, 5,446 deletions.
~27,244 LOC across 105 TS/TSX/JS source files, plus ~1k LOC of in-repo docs.
Single developer (Taha), working with Claude Code as the implementation partner.
Visible codebase progression in the commit log: Day 1 = CMS API demo → Day 3 = MongoDB Phase 1 → Day 7 = serve all 31 states from own DB → Day 9 = 100% audit match → Day 10 = agent platform Phase 1 shipped.

10. Unique / Non-Obvious Decisions

Deterministic floor, AI ceiling. The unusual call: build the audit-validated math engine first, then layer AI on top only where it adds irreplaceable value. Most "AI-first" healthtech goes the other way and ends up defending accuracy problems forever.
Shape parity with CMS. mongoDocToCmsPlan transforms our Mongo docs into the exact CMS API Plan shape. The frontend can't tell where a plan came from. That is what made the 30-states-from-own-DB migration invisible to the UI, and it's what will make AI features portable across data sources.
One pipeline for all plan queries. Homepage calculator and /plans marketplace both flow through fetch-plans.ts. Prevents behavior drift between surfaces, and means any AI-assisted surface gets the same pricing the marketplace shows.
Lazy Mongo client. Prevents build-time connection failures on Vercel's static generation — a subtle Next.js + serverless gotcha.
State-specific age curves honored. Utah's 1.35–1.45x ratio (legal under 45 CFR 147.102(e)) is whitelisted in the audit. Generic assumptions would have produced 196 false positives.
SBE-aware audit filtering. 14 state-based marketplaces don't publish through the federal API; the audit excludes them rather than logging false mismatches.
APTC applied at the plan-card layer, not just eligibility. realPrice = max(0, premium − finalAptc) in fetch-plans.ts for every household. Fixed a bug where non-Medicaid subsidized users saw unsubsidized sticker prices.
Custom TS-object update CMS. Not a headless CMS. Updates ship at the same velocity as code, version-controlled alongside it. Chosen because update cadence is high-signal and low-volume.
Stack stripped down on purpose. No ORM, no GraphQL, no tRPC, no Redux, no Prisma. Smaller audit surface. Every line of request-handling code is reviewable in a sitting.
Data-engineering scripts are first-class. 8.7k LOC of ingest + audit scripts live in the repo and never run in production. Treated as part of the product because data quality is the product.
Compliance-aware deferral. The agent portal is deliberately not on current infrastructure. Operating PHI flows on a non-SOC-2 stack now would compromise a future audit, so the team is waiting for AWS migration rather than shipping short-term.
No em dashes. Style rule enforced across all user-facing content.
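
The lazy Mongo client decision in the list above comes down to deferring construction until first use, so that importing the module during a build never opens a connection. A generic sketch of the pattern follows. The lazy helper and the stand-in client are illustrative assumptions; the real db.ts would construct an actual MongoDB client inside the factory.

```typescript
// Generic lazy holder: the factory runs on first call, not at import time.
function lazy<T>(factory: () => T): () => T {
  let value: T | undefined;
  let created = false;
  return () => {
    if (!created) {
      value = factory();
      created = true;
    }
    return value as T;
  };
}

// Stand-in for a database client, used only to make the sketch runnable.
interface PlaceholderClient {
  uri: string;
}

let connections = 0; // counts how many times the factory actually ran

const getClient = lazy<PlaceholderClient>(() => {
  connections += 1;
  return { uri: "mongodb://placeholder" };
});
```

Every caller goes through getClient(), so a build that never serves a request never triggers the factory.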
