Tier 0.5 - Federal+NY ZIP USPS-completeness audit (2026-05-01)

Status: Complete. 4,363 docs inserted across 3 classes. Closes Issue #80 (parent context: Issue #79).

Purpose: Catch the structural blind-spot in Tier 0. Tier 0 used U.S. Census 2020 ZCTA as its universe; Census ZCTA only catalogs ZIPs with significant residential population. PO-Box-only, business-only, single-building, and other USPS-only ZIPs are CMS-recognized but Census-blind. Tier 0.5 closes that gap by using a USPS-derived universe.

Trigger: 2026-05-01 user report - co-founder entered ZIP 85001 (downtown Phoenix) on the prod calculator and got a 404. CMS Marketplace API correctly identifies 85001 as AZ/Maricopa County (Rating Area 4). Tier 0 had no 85001 doc because Census 2020 ZCTA doesn't track PO-Box-only ZIPs.

Summary

| Metric | Count |
| --- | --- |
| USPS universe (federal+NY filter, zipcodes npm) | 24,945 distinct ZIPs |
| DB before audit (federal+NY clean) | 20,618 distinct ZIPs (across 30,695 docs) |
| DB after audit (federal+NY clean) | 24,965 distinct ZIPs (across 35,058 docs) |
| Gap zips (USPS \ DB) - Tier 0's blind spot | 4,347 |
| Insertable gaps inserted | 3,842 docs (across 3,829 unique ZIPs; 13 multi-county) |
| Discrepancy docs inserted (cross-jurisdiction) | 3 |
| Non-residential docs inserted (corporate ZIPs) | 518 |
| Total Tier 0.5 marker docs | 4,363 |
| Extras (DB has, USPS-npm doesn't) | 20 (left untouched - see "Extras" section) |
| Needs-PUF | 0 (every CMS-confirmed county had an existing DB sibling for regionId derivation) |

_seedSource: "federal-tier-0-5-audit-2026-05-01" on every inserted doc.

What was inserted

Class 1 - Insertable (3,842 docs across 3,829 unique ZIPs)

Standard federal+NY county doc with regionId derived from existing same-county DB siblings (identical pattern to Tier 0). Each entry was independently CMS-confirmed via https://marketplace.api.healthcare.gov/api/v1/counties/by/zip/{zip} before classification.

The 85001 user-reported case is in this class. Inserted as:

```json
{
  "zip": "85001",
  "countyFips": "04013",
  "county": "Maricopa County",
  "state": "AZ",
  "regionId": "Rating Area 4",
  "_seedSource": "federal-tier-0-5-audit-2026-05-01"
}
```

Verified resolution: curl https://askflorence.health/api/counties?zip=85001 returns {"counties":[{"fips":"04013","name":"Maricopa County","state":"AZ"}]}. End-to-end plan lookup returns 86 plans.
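The sibling-based regionId derivation described for this class can be sketched roughly as follows. This is an illustration only, not the seed script: the helper name `deriveRegionId`, the in-memory array of docs, and the choice to return `null` when siblings are missing or disagree are all assumptions.

```javascript
// Hypothetical helper (not from the repo): derive regionId for a new
// (zip, countyFips) doc from docs already in the DB for the same county.
function deriveRegionId(countyFips, existingDocs) {
  const regionIds = new Set(
    existingDocs
      .filter((d) => d.countyFips === countyFips && d.regionId)
      .map((d) => d.regionId)
  );
  // Assumption: only trust the siblings when they all agree on one value.
  // Zero or conflicting values return null so the caller can triage by hand.
  return regionIds.size === 1 ? [...regionIds][0] : null;
}

// Illustrative sibling docs; the sibling ZIPs here are placeholders.
const siblings = [
  { zip: "85003", countyFips: "04013", regionId: "Rating Area 4" },
  { zip: "85004", countyFips: "04013", regionId: "Rating Area 4" },
];
console.log(deriveRegionId("04013", siblings)); // "Rating Area 4"
console.log(deriveRegionId("04001", siblings)); // null
```

A `null` result would correspond to the needs-PUF bucket in the summary, which was empty for this audit.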

Per-state insertable breakdown:

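| State | Count | State | Count | State | Count |
| --- | --- | --- | --- | --- | --- |
| AK | 28 | KS | 42 | OR | 50 |
| AL | 143 | LA | 181 | SC | 110 |
| AR | 90 | MI | 166 | SD | 10 |
| AZ | 100 | MO | 118 | TN | 149 |
| DE | 28 | MS | 89 | TX | 606 |
| FL | 445 | MT | 36 | UT | 45 |
| HI | 40 | NC | 226 | WI | 94 |
| IA | 86 | ND | 19 | WV | 113 |
| IN | 149 | NE | 30 | WY | 15 |
|  |  | NH | 34 |  |  |
|  |  | NY | 321 |  |  |
|  |  | OH | 180 |  |  |
|  |  | OK | 99 |  |  |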

The 13 multi-county ZIPs (each gets 2 docs, one per county CMS returned) are spread across states with no anomalous clustering.

Class 2 - Discrepancy (3 docs)

ZIPs where USPS classifies the ZIP under a federal-30 state but CMS routes it to a different jurisdiction. Inserted with the existing platform shapes (sbeRedirect for SBE-state routing, unsupported for territory/non-marketplace cases) so the calculator surfaces a meaningful response instead of a 404.

| ZIP | USPS state | CMS routes to | Doc shape | Why |
| --- | --- | --- | --- | --- |
| 45275 | OH | KY (Boone County) | sbeRedirect to kynect | Cincinnati/Northern Kentucky International Airport (CVG); physically in Boone County, KY |
| 45999 | OH | KY (Kenton County) | sbeRedirect to kynect | IRS Service Center, Covington KY; postal classification points to OH |
| 96898 | HI | MH (Marshall Islands - Kwajalein Atoll) | unsupported: us_territory_no_marketplace | US Army installation under Compact of Free Association; ACA Marketplace not available |

Verified resolution:

  • curl ?zip=45275 → {"sbeRedirect":{"state":"KY","marketplace":"kynect (kynect.ky.gov)"}}
  • curl ?zip=96898 → {"unsupported":{"reason":"us_territory_no_marketplace","message":"ACA Marketplace coverage isn't available in the Marshall Islands. If you live or work at the Kwajalein Atoll installation, contact your sponsor's HR or your Tricare benefits administrator for coverage options."},"alternateCounties":[]}

Class 3 - Non-residential (518 docs)

Corporate, business-only, or single-recipient mailing ZIPs that USPS recognizes but CMS does NOT (CMS returns {"counties":null} for each). Examples: 10046 NY "Contest Mail", 10072 NY "Philip Morris", 10094 NY "Marden Kane Inc", 10197 NY "Citicorp Services Inc", 19889 DE "Beneficial Natl Bank".

CMS itself rejects these with "invalid zipcode or fips provided" if you POST a plans-search. Healthcare.gov's user flow surfaces "this ZIP isn't recognized" for the same set. Before Tier 0.5, our calculator returned a bare 404 "Zip code not found".

Inserted shape (uniform across all 518):

```json
{
  "zip": "10197",
  "countyFips": "",
  "county": "",
  "state": "NY",
  "regionId": "",
  "unsupported": {
    "reason": "non_residential",
    "message": "This ZIP code is registered as a corporate, business-only, or single-recipient mailing address and isn't used for residential health-insurance lookups. Please enter the ZIP code for the address where you actually live."
  },
  "_seedSource": "federal-tier-0-5-audit-2026-05-01"
}
```

The route's unsupported branch short-circuits before referencing countyFips/county/regionId, so the empty values are safe.
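To make the short-circuit concrete, here is a minimal sketch of the branch order. It is not the actual route handler: the function name `buildCountiesResponse`, the `{ status, body }` return shape, the 404 body, and the 200 status on special-case responses are assumptions. Only the doc field names and the `counties` response shape come from the examples in this document.

```javascript
// Sketch of the branch order only; not the production route handler.
function buildCountiesResponse(docs) {
  if (docs.length === 0) {
    // Assumed body shape for the bare 404.
    return { status: 404, body: { error: "Zip code not found" } };
  }
  // Special-case docs are checked first, so the empty countyFips / county /
  // regionId values on non_residential docs are never read.
  const unsupportedDoc = docs.find((d) => d.unsupported);
  if (unsupportedDoc) {
    return { status: 200, body: { unsupported: unsupportedDoc.unsupported } };
  }
  const redirectDoc = docs.find((d) => d.sbeRedirect);
  if (redirectDoc) {
    return { status: 200, body: { sbeRedirect: redirectDoc.sbeRedirect } };
  }
  // Standard county docs: map DB field names to the API response shape.
  return {
    status: 200,
    body: {
      counties: docs.map((d) => ({
        fips: d.countyFips,
        name: d.county,
        state: d.state,
      })),
    },
  };
}
```

The live 96898 response also carries an `alternateCounties` array; the sketch omits it because the rules for populating it are not described here.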

Why no alternateCounties: unlike PO-Box ZIPs (which serve real residential populations who picked the wrong ZIP), corporate ZIPs have no associated residential population - any "nearest county" computation would invent counties unrelated to where the corporation's employees actually live. CMS doesn't surface alternates for these either; the right UX nudge is "use your home ZIP," not "pick a nearby county."

Per-state non-residential breakdown:

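| State | Count | State | Count | State | Count |
| --- | --- | --- | --- | --- | --- |
| AK | 1 | KS | 10 | OR | 12 |
| AL | 40 | LA | 6 | SC | 6 |
| AR | 7 | MI | 12 | SD | 9 |
| AZ | 51 | MO | 20 | TN | 11 |
| DE | 2 | MS | 17 | TX | 65 |
| FL | 36 | MT | 1 | UT | 5 |
| HI | 2 | NC | 12 | WI | 21 |
| IA | 9 | ND | 1 | WV | 6 |
| IN | 32 | NE | 5 | WY | 1 |
|  |  | NH | 3 |  |  |
|  |  | NY | 67 |  |  |
|  |  | OH | 35 |  |  |
|  |  | OK | 13 |  |  |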

Concentration matches the expected business-density pattern: NY (67), TX (65), AZ (51), AL (40), and FL (36) lead.

What was NOT inserted (documented exclusions)

Extras (20 ZIPs in DB but not in USPS-npm universe)

20135  42223  42602  56144  56136  56219  56220  56164  56257  80737
72405  72713  75036  75072  83342  89421  99362  30555  30559  88240

Two well-understood subclasses:

16 of 20 - Cross-state border ZIPs that USPS classifies under an SBE host state:

These ZIPs serve residents on BOTH sides of a state line. USPS picks ONE state for postal classification; CMS knows about every county the ZIP covers. Our DB carries multi-county truth. When USPS picks an SBE state for postal classification, the npm-derived universe filters out the ZIP entirely (we filter to federal+NY) - so the federal-side entry in our DB looks "extra" relative to the npm filtered set.

Examples:

  • 20135 Bluemont - USPS=VA (SBE), CMS returns VA-Clarke + WV-Jefferson + VA-Loudoun. DB has all 3.
  • 42223 Fort Campbell military base - USPS=KY (SBE since 2024 kynect), CMS returns KY-Christian + TN-Montgomery. DB has both.
  • 56144 Jasper MN - USPS=MN (SBE), CMS returns MN-Rock + MN-Pipestone + SD-Moody. DB has all 3.
  • 30555 30559 88240 - the 3 Path 1 cross-state border fixes (commit 843bdf7).

4 of 20 - ZIPs missing from zipcodes npm package (data freshness gap):

75036, 72405, 72713, 75072 - all in our DB (sourced from CMS at insert time, fully canonical) but not in the zipcodes npm package. Likely USPS additions after the npm package's last data refresh; high-growth corridors like Frisco/McKinney TX often see new ZIPs.

Action: leave alone. Existing DB data is canonical. The "extra" classification is an artifact of an incomplete USPS-npm universe, not a data defect. If we ever upgrade to the HUD ZIP-County crosswalk (richer, refreshed quarterly), Tier 0.5 audits will catch any future cases of this pattern automatically.

Methodology

Universe choice

We considered two USPS-derived universe sources:

  1. HUD ZIP-County crosswalk - refreshed quarterly, requires HUD account (auth click-through), authoritative.
  2. zipcodes npm package - MIT-licensed, ~44K USPS-derived records, zero auth friction, slightly stale.

Chose zipcodes npm for v1 of Tier 0.5 for zero auth friction. Sanity-gated against 6 known PO-Box-only ZIPs (85001 AZ, 10008 NY, 33101 FL, 78201 TX, 73101 OK, 84101 UT). All 6 present.
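A sanity gate of this kind might look like the sketch below. The six ZIPs are the ones listed above; the helper names and the `{ zip, state }` record shape are assumptions, and the real `build-usps-snapshot.js` may structure the check differently.

```javascript
// The six ZIPs named in the sanity gate above.
const SANITY_ZIPS = ["85001", "10008", "33101", "78201", "73101", "84101"];

// Hypothetical helper: returns the sanity ZIPs absent from the universe.
// `records` is assumed to be an array of { zip, state } objects.
function missingSanityZips(records) {
  const present = new Set(records.map((r) => r.zip));
  return SANITY_ZIPS.filter((zip) => !present.has(zip));
}

// Fail loudly before any snapshot is written if a known ZIP is missing.
function assertSanityGate(records) {
  const missing = missingSanityZips(records);
  if (missing.length > 0) {
    throw new Error(`Sanity gate failed; missing ZIPs: ${missing.join(", ")}`);
  }
}
```

Failing before the CSV is emitted keeps a stale or truncated universe from silently shrinking the gap set.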

The 4 npm-stale extras (75036, 72405, 72713, 75072) are the cost of using a static package vs a quarterly-refreshed crosswalk. Acceptable trade-off for Tier 0.5 v1 since they don't introduce inconsistency (CMS knows about them too; our DB is already correct for them). HUD upgrade is the right next step for ongoing refresh cadence.

Structural difference from Tier 0

Tier 0 universe was Census ZCTA, which carries (zip, state, countyFips) tuples natively. Universe membership and DB membership were both checked at the (zip, countyFips) tuple level.

Tier 0.5 universe is (zip, state) only - no countyFips. So gap detection is at the ZIP level, and CMS becomes the authoritative source for (state, countyFips, county) assignment per gap ZIP. This is more correct for the Tier 0.5 thesis (CMS uses USPS data to assign counties, so CMS will agree with USPS on what ZIPs exist).
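The ZIP-level comparison amounts to two set differences: gaps (USPS \ DB) and extras (DB \ USPS). A minimal sketch, assuming both sides have already been reduced to arrays of 5-digit ZIP strings (the function name `diffZipSets` is hypothetical):

```javascript
// ZIP-level set difference between the USPS universe and the DB.
function diffZipSets(uspsZips, dbZips) {
  const usps = new Set(uspsZips);
  const db = new Set(dbZips);
  return {
    // In the USPS universe but missing from the DB: candidates for a CMS lookup.
    gaps: [...usps].filter((zip) => !db.has(zip)).sort(),
    // In the DB but not in the USPS universe: the "extras" class.
    extras: [...db].filter((zip) => !usps.has(zip)).sort(),
  };
}

const result = diffZipSets(["85001", "85003"], ["85003", "20135"]);
console.log(result); // { gaps: [ '85001' ], extras: [ '20135' ] }
```

Because the comparison keys on ZIP alone, a ZIP that is present in the DB with only some of its counties does not show up as a gap; that is the tuple-level blind spot noted under Limitations.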

Pipeline

  1. scripts/db/build-usps-snapshot.js - filter zipcodes npm to federal-30 + NY, emit data/usps-zip-state-2026-05-01.csv (24,945 ZIPs).
  2. scripts/db/audit-federal-completeness-tier-0-5.js - load universe, load DB state, set-difference at zip-level, query CMS for each gap zip, classify each (zip, countyFips) CMS returns into insertable / needs-PUF / discrepancy / cms-error.
  3. scripts/db/retry-cms-errors-tier-0-5.js - retry pass for HTTP 429s at lower concurrency with exponential backoff, then re-classify (initial run at concurrency=10 hit CMS rate limits; retry at concurrency=3 with backoff cleared 2,064 of 2,352 transient failures).
  4. scripts/db/seed-federal-tier-0-5.js - apply user-greenlighted batches with --state/--class filters, three-mode CLI (--dry-run / --apply / --rollback), idempotency guard, marker tagging.
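The classification step in the audit script could be sketched as below. This is a reading of the class definitions in this document, not the script's actual logic. The function name, the `{ error }` / `{ counties }` input shape, and the rule that any county outside the federal+NY state set counts as a discrepancy are assumptions. The state list is taken from the per-state breakdown tables.

```javascript
// The 31 federal+NY states that appear in the per-state breakdown tables.
const FEDERAL_PLUS_NY = new Set([
  "AK", "AL", "AR", "AZ", "DE", "FL", "HI", "IA", "IN", "KS", "LA",
  "MI", "MO", "MS", "MT", "NC", "ND", "NE", "NH", "NY", "OH", "OK",
  "OR", "SC", "SD", "TN", "TX", "UT", "WI", "WV", "WY",
]);

// cmsResult: { error } on HTTP failure, else the parsed body { counties }.
// siblingFips: Set of countyFips values that already have a doc in the DB.
function classifyGapZip(cmsResult, siblingFips) {
  if (cmsResult.error) return [{ class: "cms-error" }];
  const counties = cmsResult.counties || [];
  // CMS returns {"counties": null} for ZIPs it does not recognize.
  if (counties.length === 0) return [{ class: "non-residential" }];
  return counties.map((c) => {
    let cls = "insertable";
    if (!FEDERAL_PLUS_NY.has(c.state)) cls = "discrepancy";
    else if (!siblingFips.has(c.fips)) cls = "needs-PUF";
    return { class: cls, countyFips: c.fips, county: c.name, state: c.state };
  });
}
```

Returning one entry per CMS county lets a multi-county ZIP yield several docs, matching the 13 two-doc ZIPs in Class 1.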

Constraints honored

Per the session plan:

Constraint 1 - PROD BACKUP BEFORE EVERY APPLY. Every batch was preceded by a fresh mongodump of the entire zip_county collection, verified by file size + record count + sha256 + sample round-trip. Three backups taken across three batches. Stored at ~/Documents/askflorence-db-backups/zip_county/<TAG>/ (local-only - S3 sync blocked by bucket policy on the SSO admin role; flagged as separate ops follow-up).

| Batch | Backup tag | Pre-apply count | sha256 |
| --- | --- | --- | --- |
| 1: AZ insertable (100 docs) | pre-tier-0-5-batch-az-insertable-20260501T220646Z | 48,232 | bd71519d... |
| 2: discrepancy (3 docs) | pre-tier-0-5-batch-discrepancy-20260501T222700Z | 48,332 | 7801459d... |
| 3: bulk-remaining (4,260 docs) | pre-tier-0-5-batch-bulk-remaining-20260501T222753Z | 48,335 | aa81da42... |
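The size and sha256 parts of the backup verification can be illustrated with Node's built-in `crypto` module. This sketch covers only those two checks; the record-count and sample round-trip checks are not shown. The `manifest` object and both function names are hypothetical.

```javascript
const { createHash } = require("node:crypto");

// Hex-encoded SHA-256 of a Buffer or string.
function sha256Hex(data) {
  return createHash("sha256").update(data).digest("hex");
}

// `manifest` is a hypothetical record written at dump time: { bytes, sha256 }.
function verifyBackup(data, manifest) {
  const buf = Buffer.isBuffer(data) ? data : Buffer.from(data);
  return {
    sizeOk: buf.length === manifest.bytes,
    hashOk: sha256Hex(buf) === manifest.sha256,
  };
}
```

For a full-collection dump it would likely be preferable to stream the file into the hash instead of loading it into memory; the in-memory version is kept here for brevity.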

Constraint 2 - ANALYZE → REPORT → PHASED USER DECISIONS, never auto-update. No write occurred without an explicit user greenlight. Three batches applied as: AZ insertable (smoke + 85001 quick-win) → discrepancy (validated new sbeRedirect + territory shapes) → bulk-remaining (rest of insertable + non_residential).

Verification - TRUE 100% match achieved

| Gate | Result |
| --- | --- |
| 85001 prod live API | Returns {"counties":[{"fips":"04013","name":"Maricopa County","state":"AZ"}]} ✓ |
| 85001 plan lookup end-to-end | 86 plans returned (Catastrophic Standard 68445AZ0590050 $338.26/mo first) ✓ |
| 50613 prod live API | Returns 4 counties (Black Hawk + Bremer + Butler + Grundy) ✓ |
| Calculator baseline diff (12 scenarios) | ZERO DIFFS post-batch-1 + post-batch-3 ✓ |
| Tier 0.5 re-run | 0 gap zips remaining (was 4,347) ✓ |
| Tier 1 audit (federal zip-county) | 22,302/22,302 = 100.00% exact match ✓ (after audit-script patch + 50613 fix + rate-limit retry validation) |
| Tier 1.5 audit (SBE zip-county) | 13,055/13,055 = 100.00% exact match ✓ (after rate-limit retry validation) |
| Smoke matrix on 10 inserted ZIPs across multiple states | 10/10 return correct counties ✓ |
| Smoke matrix on 5 non_residential ZIPs | 5/5 return generic unsupported message ✓ |
| Smoke matrix on 3 discrepancy ZIPs | 3/3 return correct sbeRedirect/unsupported shapes ✓ |
| Per-state DB count vs audit prediction | Exact match across all 31 federal+NY states + KY (2) + MH (1) ✓ |

Path to TRUE 100% (Phase 8b drive-to-100% effort)

Initial post-apply audits showed Tier 1 = 99.84% (2 mismatches + 33 rate-limit errors) and Tier 1.5 = 99.80% (0 mismatches + 26 rate-limit errors). Three separate issues accounted for the shortfall:

Issue A: 33+26 CMS rate-limit errors in initial audits (UNKNOWN status). Built scripts/audit/validate-cms-errors.js to retry each at concurrency=1 with exponential backoff (5s/10s/20s/40s/80s) and re-classify via the same DB-vs-CMS comparison the original audit does. Result: 33/33 Tier 1 retries = MATCH; 26/26 Tier 1.5 retries = MATCH. No real mismatches were hiding behind rate limits.
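The 5s/10s/20s/40s/80s schedule is a plain doubling backoff. The sketch below shows one way to generate it and to wrap a single request in a retry loop. It is not the code in `validate-cms-errors.js`: the function names, the injected `fetchOnce` callback, and the injected `sleep` are assumptions, chosen so the loop can be exercised without hitting CMS.

```javascript
// Doubling delays: backoffDelaysMs(5000, 5) gives 5s, 10s, 20s, 40s, 80s.
function backoffDelaysMs(baseMs, retries) {
  return Array.from({ length: retries }, (_, i) => baseMs * 2 ** i);
}

// fetchOnce() is assumed to resolve to an object with a numeric `status`.
// Retries only on HTTP 429 and returns the last response either way.
async function retryOn429(
  fetchOnce,
  delaysMs,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
) {
  let res = await fetchOnce();
  for (const delay of delaysMs) {
    if (res.status !== 429) break;
    await sleep(delay);
    res = await fetchOnce();
  }
  return res;
}

console.log(backoffDelaysMs(5000, 5)); // [ 5000, 10000, 20000, 40000, 80000 ]
```

Running the calls one at a time (concurrency=1, as described above) means the delays are never compounded by parallel requests competing for the same rate limit.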

Issue B: ZIP 96898 Marshall Islands - audit-script false positive. The Tier 1 script's $match filter excluded our MH/Kwajalein unsupported-class doc from the "ours" comparison but didn't subtract the corresponding (zip, fips) tuple from the CMS-side, so it surfaced as "extra in CMS / county-count mismatch." Patched the audit script to pre-fetch all (zip, fips) tuples from unsupported-class or non-federal-state docs and subtract them from the CMS comparison. Re-ran patched audit; 96898 now exact match. Patch is permanent in the audit script for future runs.
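The shape of that patch can be sketched as a symmetric exclusion: any (zip, fips) tuple that is filtered out of the "ours" side is also removed from the CMS side before comparing. The code below is an illustration under assumed names and input shapes, not the patched audit script.

```javascript
// dbDocs: [{ zip, countyFips, state, unsupported? }]
// cmsTuples: [{ zip, fips }]
// federalStates: Set of state codes that are in scope for the audit.
function compareTuples(dbDocs, cmsTuples, federalStates) {
  const key = (zip, fips) => `${zip}|${fips}`;
  const ours = new Set();
  const excluded = new Set();
  for (const d of dbDocs) {
    const k = key(d.zip, d.countyFips);
    // Unsupported-class and non-federal-state docs are out of scope.
    if (d.unsupported || !federalStates.has(d.state)) excluded.add(k);
    else ours.add(k);
  }
  // The fix: subtract the excluded tuples from the CMS side as well, so
  // they cannot surface as "extra in CMS".
  const theirs = new Set(
    cmsTuples.map((t) => key(t.zip, t.fips)).filter((k) => !excluded.has(k))
  );
  return {
    missingInDb: [...theirs].filter((k) => !ours.has(k)),
    extraInDb: [...ours].filter((k) => !theirs.has(k)),
  };
}
```

Without the `filter` on the CMS side, a doc that was excluded from `ours` would be reported as a mismatch even though the DB holds it, which is the false positive described above.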

Issue C: ZIP 50613 IA - real data gap (missing Bremer County). This was a tuple-level multi-county completeness gap that Tier 0.5's zip-level gap detection didn't catch. Validated via 5 CMS lookups (5/5 returned Bremer, so the result is consistent) and regionMap availability (all 13 existing IA/19017 sibling docs carry regionId: "Rating Area 7"). Built scripts/db/fix-tier-1-completeness-gaps.js with a dedicated _seedSource: "tier-1-completeness-fix-2026-05-01" marker for surgical rollback, took a fresh backup (pre-tier-1-completeness-fix-50613-20260501T231959Z), and applied 1 doc. Smoke test: prod /api/counties?zip=50613 returns all 4 counties.

After A + B + C: re-ran patched Tier 1 fresh (cleared progress cache) → 22,302/22,302 = 100.00% exact match, 0 mismatches, 0 extras, 1 transient rate-limit error (validated as MATCH on retry).

Refresh cadence

Annual refresh playbook (extends Tier 0's):

  1. Update zipcodes npm package - npm update zipcodes to pull the latest USPS data refresh.
  2. Rebuild USPS snapshot - node scripts/db/build-usps-snapshot.js (verify sanity gate).
  3. Re-run Tier 0.5 audit - MONGODB_URI=<prod-read> CMS_API_KEY=<key> node scripts/db/audit-federal-completeness-tier-0-5.js.
  4. Retry rate-limit pass if needed - MONGODB_URI=... CMS_API_KEY=... node scripts/db/retry-cms-errors-tier-0-5.js.
  5. Triage report per Constraint 2 (per-state, per-class breakdown, anomaly flags).
  6. Phased apply with backups per Constraint 1.
  7. Append change-log entry with timestamp + commit SHA + counts.

Recommended upgrade path: swap zipcodes npm for HUD ZIP-County crosswalk (https://www.huduser.gov/portal/datasets/usps_crosswalk.html) before next plan-year refresh. Quarterly refresh + richer county-fips data + catches the 4 npm-stale extras automatically. Free; requires HUD account.

Limitations + follow-ups

  • Tier 0.5b - tuple-level multi-county completeness audit. ZIP 50613 was the only such gap Tier 1 surfaced here. Because Tier 0.5 detects gaps at the ZIP level rather than the tuple level, other ZIPs in DB could still be missing multi-county tuples; Tier 1's 22,302/22,302 score only vouches for the tuples it actually audited. A defensive follow-up audit could iterate every ZIP in DB, query CMS, and insert any missing (zip, countyFips) tuples - same machinery as Tier 0.5 itself but tuple-level instead of zip-level. Tier 1's clean state today is the empirical floor; the systematic check would be the ceiling.
  • zipcodes npm staleness: 4 ZIPs in our DB (75036, 72405, 72713, 75072) aren't in the npm package. Already correct in DB; HUD upgrade fixes the audit-side completeness.
  • Local-only backups: S3 sync to s3://askflorence-data/db-backups/ blocked because the SSO admin role lacks bucket-policy access (correct prod hardening - ECS task role + GitHub OIDC role have access, human admin doesn't). Separate ops issue needed for either a scoped bucket-policy allow or a dedicated assumable backup-role. Until then: ~/Documents/askflorence-db-backups/zip_county/ is the canonical location.
  • non_residential message tweak: consider improving the route-handler's bare 404 "Zip code not found" message even for ZIPs not in DB at all (e.g., user typo) - "ZIP not recognized; check the digits and try your home address" reads better than "Zip code not found." Frontend-side change, separate from Tier 0.5 scope.
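The Tier 0.5b follow-up in the first bullet could start from a sweep like the one below. It is a sketch under stated assumptions: `lookupCounties` is an injected function (for example, a rate-limited wrapper around the CMS counties-by-zip endpoint) that resolves to an array of `{ fips }` objects or `null`, and the function names are hypothetical. The sweep only reports missing tuples; inserting them would still go through the backup and greenlight steps in Constraints 1 and 2.

```javascript
// Group the countyFips values the DB already holds for each ZIP.
function groupFipsByZip(dbDocs) {
  const byZip = new Map();
  for (const d of dbDocs) {
    if (!d.countyFips) continue; // skip unsupported docs with empty fips
    if (!byZip.has(d.zip)) byZip.set(d.zip, new Set());
    byZip.get(d.zip).add(d.countyFips);
  }
  return byZip;
}

// Sequential sweep: one CMS lookup per ZIP, collecting tuples CMS returns
// that the DB lacks.
async function findMissingTuples(dbDocs, lookupCounties) {
  const missing = [];
  for (const [zip, have] of groupFipsByZip(dbDocs)) {
    const counties = (await lookupCounties(zip)) || [];
    for (const c of counties) {
      if (!have.has(c.fips)) missing.push({ zip, countyFips: c.fips });
    }
  }
  return missing;
}
```

Applied to the pre-fix state of 50613, a sweep like this would have flagged the Bremer County tuple (19017) that the ZIP-level audit could not see.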

Cross-references

  • Issue #80 - execution tracker (closes when this doc lands)
  • Issue #79 - parent context (Tier 0.5 gap class scoping)
  • Tier 0 audit doc - precursor Census-derived audit
  • Commit 7b716d0 - Phase 5 dry-run audit + scripts
  • Commit 749a13d - seed script
  • ~/.claude/plans/users-tahaabbasi-developer-ask-florence-fizzy-gray.md - session plan
AskFlorence Internal Documentation. Not for public distribution.
