Runbook — Break-Glass Root Login
Use this only when the standard SSO path is unavailable. Every break-glass action is logged and reviewed.
When break-glass is authorized
Only these scenarios:
- AWS Identity Center / Google Workspace SSO is fully down and a SEV-0 incident requires AWS API access NOW.
- MFA device lost AND no recovery codes available AND a time-bound business need exists.
- AWS Console outage in the SSO IdP region preventing console login when an action requires console access.
NOT authorized for:
- Convenience ("MFA prompt is slow")
- Routine work ("I forgot my YubiKey")
- Avoiding a permission-set request
Who can break-glass
Today: Taha only. The AWS account root credentials live in Taha's password manager. There is no second human break-glass principal until hardware MFA is enrolled for Asad and a parallel root-credential workflow is documented (see risk R-001 and #67).
When a second principal is provisioned, this runbook gets a second row in the "Who can break-glass" table.
What is the break-glass credential
Per AWS account in the org:
| Account | Account ID | Break-glass principal | Where the credential lives |
|---|---|---|---|
| Management | 778477254880 | AWS account root user ([email protected]) | Taha's password manager + offline backup |
| Production | 039624954211 | AWS account root user (member-account email) | Taha's password manager + offline backup |
| Staging | 549136075525 | AWS account root user | Taha's password manager + offline backup |
| Log-archive | 754660694122 | AWS account root user | Taha's password manager + offline backup |
Each root user is to have a hardware MFA device (YubiKey) enrolled. That enrollment is pending #67; until it lands, a virtual MFA app is used. The root user never has programmatic access keys; access is console + MFA only.
The MFA codes for root accounts live separately from the passwords — in a different password manager vault, with a different unlock factor.
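The "no access keys, MFA enrolled" rule above can be spot-checked per account. The sketch below is illustrative and not part of this runbook's tooling: `rootPostureFindings` is a hypothetical helper, and it assumes you have already fetched the `SummaryMap` object from IAM `GetAccountSummary` (for example with `aws iam get-account-summary`) while signed in to the account in question. Note that `AccountMFAEnabled` does not distinguish a YubiKey from a virtual MFA app.

```javascript
// Hypothetical helper (an assumption, not existing tooling).
// `summaryMap` is the SummaryMap object from IAM GetAccountSummary.
function rootPostureFindings(summaryMap) {
  const findings = [];
  // AccountAccessKeysPresent is 1 when the root user has access keys.
  // Anything other than an explicit 0 is treated as a finding (fail closed).
  if (summaryMap.AccountAccessKeysPresent !== 0) {
    findings.push("root user has programmatic access keys");
  }
  // AccountMFAEnabled is 1 when an MFA device is enrolled for the root user.
  if (summaryMap.AccountMFAEnabled !== 1) {
    findings.push("root user has no MFA device enrolled");
  }
  return findings;
}

// Usage: an empty array means both checks passed.
// rootPostureFindings({ AccountAccessKeysPresent: 0, AccountMFAEnabled: 1 })  // returns []
```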
Procedure — invoke break-glass
- Confirm the standard SSO path is unavailable. Try once. Document the symptom in the incident channel.
- Notify the other founders (Asad + Ian) by text or call BEFORE invoking. Reasoning: a single principal's emergency action should not be a surprise to the rest of the team.
- Sign in to AWS Console with the account root user + password + MFA.
- Open `agent_audit_log` and write a row (via Atlas web shell if needed):
  ```javascript
  // Replace every <placeholder> before running.
  db.agent_audit_log.insertOne({
    timestamp: new Date(),
    actor_id: "[email protected]",
    actor_role: "break_glass_root",
    action: "break_glass_root_login",
    resource_type: "aws_account",
    resource_id: "<account-id>",
    ip_address: "<current-IP>",
    user_agent: "<browser-UA>",
    result: "success",
    metadata: {
      reason: "<one-line rationale>",
      incident_channel: "<google-chat-link-or-NA>",
      expected_session_duration_minutes: <N>,
    },
  });
  ```
- Perform only the minimum action required. Do not browse. Do not change unrelated settings.
- Sign out as soon as the action completes.
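Typing the audit row by hand during an incident invites mistakes, so one option is to build it with a small function that validates the inputs first. This is a sketch under assumptions: `buildBreakGlassAuditRow` and its parameter names are hypothetical, while the field names and fixed values follow the audit row described in the procedure. The account ID must be passed as a string, because IDs such as 039624954211 start with a zero.

```javascript
// Hypothetical helper (an assumption, not existing tooling): builds the
// agent_audit_log document for a break-glass root login.
function buildBreakGlassAuditRow({
  actorId,
  accountId,
  ipAddress,
  userAgent,
  reason,
  incidentChannel = "NA",
  expectedMinutes,
}) {
  if (typeof actorId !== "string" || actorId.trim() === "") {
    throw new Error("actorId is required");
  }
  // AWS account IDs are exactly 12 digits; keep them as strings.
  if (typeof accountId !== "string" || !/^\d{12}$/.test(accountId)) {
    throw new Error("accountId must be a 12-digit string");
  }
  if (typeof reason !== "string" || reason.trim() === "") {
    throw new Error("reason is required");
  }
  if (!Number.isInteger(expectedMinutes) || expectedMinutes <= 0) {
    throw new Error("expectedMinutes must be a positive integer");
  }
  return {
    timestamp: new Date(),
    actor_id: actorId,
    actor_role: "break_glass_root",
    action: "break_glass_root_login",
    resource_type: "aws_account",
    resource_id: accountId,
    ip_address: ipAddress,
    user_agent: userAgent,
    result: "success",
    metadata: {
      reason: reason.trim(),
      incident_channel: incidentChannel,
      expected_session_duration_minutes: expectedMinutes,
    },
  };
}

// In the Atlas web shell, the result could be passed straight to the insert:
// db.agent_audit_log.insertOne(buildBreakGlassAuditRow({ ... }));
```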
After break-glass
Within 24 hours:
- Document at `docs/session-log/<date>-break-glass-<slug>.md`: what was done, what triggered it, what was changed, expected return-to-normal time.
- Restore the standard path: provision a new MFA device, restore SSO, etc.
- Rotate any credential touched during the break-glass session — defensive even if no compromise is suspected. Includes any AWS access key created (should be zero), any password set, any policy modified.
- Inform Compliance Liaison (Asad) — for inclusion in the next quarterly access review as a documented break-glass event.
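To make the 24-hour window and the session-log filename concrete, the follow-up items can be derived from the login time. The sketch below rests on assumptions the runbook does not state: `breakGlassFollowUp` is a hypothetical helper, `<date>` is taken to be the UTC date in `YYYY-MM-DD` form, and the 24 hours are counted from the audit-row timestamp rather than from sign-out.

```javascript
// Hypothetical helper (an assumption, not existing tooling).
// loginTime: Date of the break-glass login (the audit row's timestamp).
// slug: short free-text label for the event.
function breakGlassFollowUp(loginTime, slug) {
  // Assumed date format: UTC calendar date, YYYY-MM-DD.
  const date = loginTime.toISOString().slice(0, 10);
  // Lowercase, collapse non-alphanumeric runs to "-", trim stray dashes.
  const safeSlug = slug
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return {
    sessionLogPath: `docs/session-log/${date}-break-glass-${safeSlug}.md`,
    // Assumed: the 24-hour clock starts at login.
    dueBy: new Date(loginTime.getTime() + 24 * 60 * 60 * 1000),
  };
}
```

For a login at 2025-01-15T22:30:00Z with the slug "SSO outage (IdP)", this yields `docs/session-log/2025-01-15-break-glass-sso-outage-idp.md` and a due time of 2025-01-16T22:30:00Z.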
Post-event review
At the next quarterly access review:
- Confirm the break-glass session was the minimum action required.
- Confirm the credential rotations actually happened.
- Confirm the session-log post-mortem matches the audit-log row + CloudTrail entries.
- Decide if the break-glass procedure needs updating based on this exercise.
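The audit-log versus CloudTrail comparison in the checklist above can be partly mechanized. The following is a sketch, not the review procedure itself: `unloggedRootLogins` is a hypothetical helper, the 30-minute matching tolerance is an arbitrary assumption, and the inputs are assumed to be already exported as plain objects. It relies on CloudTrail recording root console sign-ins as `ConsoleLogin` events whose `userIdentity.type` is `Root`.

```javascript
// Hypothetical helper (an assumption, not existing tooling).
// cloudTrailEvents: CloudTrail records (eventName, eventTime,
//   recipientAccountId, userIdentity.type).
// auditRows: agent_audit_log documents.
// Returns root console logins with no matching audit row.
function unloggedRootLogins(cloudTrailEvents, auditRows, toleranceMinutes = 30) {
  const toleranceMs = toleranceMinutes * 60 * 1000;
  return cloudTrailEvents
    // Keep only console sign-ins performed by the account root user.
    .filter(
      (e) =>
        e.eventName === "ConsoleLogin" &&
        e.userIdentity &&
        e.userIdentity.type === "Root"
    )
    // Drop every login that has an audit row for the same account
    // within the tolerance window.
    .filter((e) => {
      const loginMs = Date.parse(e.eventTime);
      return !auditRows.some(
        (row) =>
          row.action === "break_glass_root_login" &&
          row.resource_id === e.recipientAccountId &&
          Math.abs(new Date(row.timestamp).getTime() - loginMs) <= toleranceMs
      );
    });
}
```

An empty result means every root console login in the export has a corresponding audit row; any returned event is a login that needs explaining at the review.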
Anti-patterns (do NOT)
- Do NOT use root credentials to create IAM users / access keys / policies for "convenience"
- Do NOT use root for routine work even if SSO is up
- Do NOT skip the `agent_audit_log` entry; the audit row is the evidence artifact
- Do NOT use root to bypass an in-progress incident response (e.g. "I'll just root-login to look around"). Containment via the IR runbook is the right path.
Reference
- Access Control Policy
- Security Incident Response runbook
- Incident Response Plan
- Risk Assessment R-001 — single-principal admin risk
- #67 — hardware MFA enrollment tracking
- AWS root user best practices: https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html