Runbook — Break-Glass Root Login

Use this only when the standard SSO path is unavailable. Every break-glass action is logged and reviewed.

When break-glass is authorized

Only these scenarios:

AWS Identity Center / Google Workspace SSO is fully down and a SEV-0 incident requires AWS API access NOW.
MFA device lost AND no recovery codes available AND a time-bound business need exists.
AWS Console outage in the SSO IdP region preventing console login when an action requires console access.

NOT authorized for:

Convenience ("MFA prompt is slow")
Routine work ("I forgot my YubiKey")
Avoiding a permission-set request

Who can break-glass

Today: Taha only. AWS account root credentials live in Taha's password manager. There is no second human break-glass principal until hardware MFA is enrolled for Asad and a parallel root-credential workflow is documented (see risk R-001 and #67).

When a second principal is provisioned, add them to the "Who can break-glass" section of this runbook.

What is the break-glass credential

Per AWS account in the org:

Account	Account ID	Break-glass principal	Where the credential lives
Management	778477254880	AWS account root user (`[email protected]`)	Taha's password manager + offline backup
Production	039624954211	AWS account root user (member-account email)	Taha's password manager + offline backup
Staging	549136075525	AWS account root user	Taha's password manager + offline backup
Log-archive	754660694122	AWS account root user	Taha's password manager + offline backup

Each root user is to have hardware MFA (YubiKey) enrolled. Enrollment is pending #67; until then, each root user is protected by a virtual MFA app. Root users never have programmatic access keys: console + MFA only.

The MFA codes for root accounts live separately from the passwords — in a different password manager vault, with a different unlock factor.
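The root-user constraints above (MFA enrolled, no access keys) can be spot-checked from an IAM account summary. The following is a minimal sketch, assuming the input has the shape of the `SummaryMap` returned by IAM `GetAccountSummary`. The function name `checkRootPosture` is hypothetical, and fetching the summary is left out.

```javascript
// Sketch: evaluate root-user posture from an IAM account summary.
// `summaryMap` is assumed to look like the SummaryMap from IAM GetAccountSummary.
// checkRootPosture is a hypothetical helper name.
function checkRootPosture(summaryMap) {
  const findings = [];
  // AccountAccessKeysPresent is 1 when the root user has access keys.
  if (summaryMap.AccountAccessKeysPresent !== 0) {
    findings.push("root user has programmatic access keys");
  }
  // AccountMFAEnabled is 1 when the root user has an MFA device.
  if (summaryMap.AccountMFAEnabled !== 1) {
    findings.push("root user has no MFA device enrolled");
  }
  return findings;
}

// Example with hypothetical summary values.
console.log(checkRootPosture({ AccountAccessKeysPresent: 0, AccountMFAEnabled: 1 }));
console.log(checkRootPosture({ AccountAccessKeysPresent: 1, AccountMFAEnabled: 0 }));
```

This check does not distinguish a hardware key from a virtual MFA app, so it cannot confirm that #67 is complete.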

Procedure — invoke break-glass

Confirm the standard SSO path is unavailable. Try once. Document the symptom in the incident channel.
Notify the other founders (Asad + Ian) by text or call BEFORE invoking. Reasoning: a single principal's emergency action should not be a surprise to the rest of the team.
Sign in to AWS Console with the account root user + password + MFA.

Open agent_audit_log and write a row (via Atlas web shell if needed):

javascript

db.agent_audit_log.insertOne({
  timestamp: new Date(),
  actor_id: "[email protected]",
  actor_role: "break_glass_root",
  action: "break_glass_root_login",
  resource_type: "aws_account",
  resource_id: "<account-id>",
  ip_address: "<current-IP>",
  user_agent: "<browser-UA>",
  result: "success",
  metadata: {
    reason: "<one-line rationale>",
    incident_channel: "<google-chat-link-or-NA>",
    expected_session_duration_minutes: 0, // replace 0 with <N>, the expected session length in minutes
  },
});

Perform only the minimum action required. Do not browse. Do not change unrelated settings.
Sign out as soon as the action completes.
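The agent_audit_log row from the procedure above is easy to paste with a placeholder left unfilled. The following is a minimal sketch of a helper that builds the same row and refuses unfilled fields. The name `buildBreakGlassAuditRow`, its parameter names, and the example values (email, IP, user agent) are assumptions for illustration, not existing tooling.

```javascript
// Sketch: assemble the agent_audit_log row and reject unfilled placeholders.
// buildBreakGlassAuditRow is a hypothetical helper, not existing tooling.
function buildBreakGlassAuditRow(input) {
  const { actorId, accountId, ipAddress, userAgent, reason, incidentChannel, expectedMinutes } = input;

  if (!/^\d{12}$/.test(accountId)) {
    throw new Error("accountId must be a 12-digit AWS account ID");
  }
  if (!Number.isInteger(expectedMinutes) || expectedMinutes <= 0) {
    throw new Error("expectedMinutes must be a positive integer");
  }

  // Every text field must be filled in; "<...>" means a template
  // placeholder was left untouched.
  const textFields = { actorId, ipAddress, userAgent, reason, incidentChannel };
  for (const [name, value] of Object.entries(textFields)) {
    if (typeof value !== "string" || value.trim() === "" || /<[^>]*>/.test(value)) {
      throw new Error(`${name} is missing or still a placeholder`);
    }
  }

  return {
    timestamp: new Date(),
    actor_id: actorId,
    actor_role: "break_glass_root",
    action: "break_glass_root_login",
    resource_type: "aws_account",
    resource_id: String(accountId),
    ip_address: ipAddress,
    user_agent: userAgent,
    result: "success",
    metadata: {
      reason,
      incident_channel: incidentChannel,
      expected_session_duration_minutes: expectedMinutes,
    },
  };
}

// Example with hypothetical values (documentation IP range, example.com address).
const exampleRow = buildBreakGlassAuditRow({
  actorId: "founder@example.com",
  accountId: "778477254880",
  ipAddress: "203.0.113.10",
  userAgent: "Mozilla/5.0",
  reason: "SSO down during SEV-0",
  incidentChannel: "NA",
  expectedMinutes: 20,
});
console.log(exampleRow);
```

In a mongo shell session, the returned object could be passed straight to `db.agent_audit_log.insertOne(...)`.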

After break-glass

Within 24 hours:

Document at docs/session-log/<date>-break-glass-<slug>.md — what was done, what triggered it, what was changed, expected return-to-normal time.
Restore the standard path — provision a new MFA device, restore SSO, etc.
Rotate any credential touched during the break-glass session, as a defensive measure even if no compromise is suspected. This covers any AWS access key created (should be zero) and any password set. Re-review any policy modified during the session as well.
Inform Compliance Liaison (Asad) — for inclusion in the next quarterly access review as a documented break-glass event.
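The session-log file name above follows a fixed pattern, so it can be generated rather than typed. The following is a minimal sketch, assuming `<date>` means a UTC date in YYYY-MM-DD form and `<slug>` is a lowercase, hyphen-separated summary; neither format is stated in this runbook. The name `sessionLogPath` is hypothetical.

```javascript
// Sketch: build docs/session-log/<date>-break-glass-<slug>.md.
// Assumes <date> is a UTC YYYY-MM-DD date and <slug> is lowercase-hyphenated.
// sessionLogPath is a hypothetical helper name.
function sessionLogPath(date, title) {
  const day = date.toISOString().slice(0, 10);
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of other characters to one hyphen
    .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
  if (!slug) {
    throw new Error("title must contain at least one letter or digit");
  }
  return `docs/session-log/${day}-break-glass-${slug}.md`;
}

console.log(sessionLogPath(new Date("2025-01-15T09:30:00Z"), "SSO outage, prod access"));
// docs/session-log/2025-01-15-break-glass-sso-outage-prod-access.md
```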

Post-event review

At the next quarterly access review:

Confirm the break-glass session was the minimum action required.
Confirm the credential rotations actually happened.
Confirm the session-log post-mortem matches the audit-log row + CloudTrail entries.
Decide if the break-glass procedure needs updating based on this exercise.
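The cross-check between the audit-log row and CloudTrail can be partly mechanized. The following is a minimal sketch, assuming the CloudTrail records have already been exported as an array of objects with the standard record fields (`eventName`, `eventTime`, `sourceIPAddress`, `recipientAccountId`, `userIdentity.type`). The function name `reconcileRootSession`, the 30-minute matching tolerance, and the sample data are assumptions for illustration.

```javascript
// Sketch: compare one agent_audit_log row against exported CloudTrail records.
// reconcileRootSession and the 30-minute default tolerance are assumptions.
function reconcileRootSession(auditRow, events, toleranceMinutes = 30) {
  const rowTime = auditRow.timestamp.getTime();
  const toleranceMs = toleranceMinutes * 60 * 1000;
  const expectedEnd =
    rowTime + auditRow.metadata.expected_session_duration_minutes * 60 * 1000;

  // Keep only root-user activity in the account named by the audit row.
  const rootEvents = events.filter(
    (e) => e.userIdentity?.type === "Root" && e.recipientAccountId === auditRow.resource_id
  );

  // The row is written shortly after sign-in, so match a ConsoleLogin
  // within the tolerance on either side of the row timestamp.
  const login = rootEvents.find(
    (e) =>
      e.eventName === "ConsoleLogin" &&
      Math.abs(Date.parse(e.eventTime) - rowTime) <= toleranceMs
  );

  return {
    loginFound: login !== undefined,
    ipMatches: login !== undefined && login.sourceIPAddress === auditRow.ip_address,
    // Root activity after the declared session length needs an explanation.
    eventsAfterExpectedEnd: rootEvents
      .filter((e) => Date.parse(e.eventTime) > expectedEnd)
      .map((e) => e.eventName),
  };
}

// Hypothetical sample data.
const sampleRow = {
  timestamp: new Date("2025-01-15T09:35:00Z"),
  resource_id: "778477254880",
  ip_address: "203.0.113.10",
  metadata: { expected_session_duration_minutes: 20 },
};
const sampleEvents = [
  { eventName: "ConsoleLogin", eventTime: "2025-01-15T09:31:00Z", sourceIPAddress: "203.0.113.10",
    recipientAccountId: "778477254880", userIdentity: { type: "Root" } },
  { eventName: "PutBucketPolicy", eventTime: "2025-01-15T10:10:00Z", sourceIPAddress: "203.0.113.10",
    recipientAccountId: "778477254880", userIdentity: { type: "Root" } },
];
const reconciliation = reconcileRootSession(sampleRow, sampleEvents);
console.log(reconciliation);
```

A result with `loginFound: false` or a non-empty `eventsAfterExpectedEnd` is a prompt for the reviewers to look closer, not a verdict. The session-log write-up still has to be read against both sources by a person.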

Anti-patterns (do NOT)

Do NOT use root credentials to create IAM users / access keys / policies for "convenience"
Do NOT use root for routine work even if SSO is up
Do NOT skip the agent_audit_log entry — the audit row is the evidence artifact
Do NOT use root to bypass an in-progress incident response (e.g. "I'll just root-login to look around"). Containment via the IR runbook is the right path.

Reference

Access Control Policy
Security Incident Response runbook
Incident Response Plan
Risk Assessment R-001 — single-principal admin risk
#67 — hardware MFA enrollment tracking
AWS root user best practices: https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html