Premise.
The new site shipped last week. Before linking to it from anywhere that matters, I wanted to audit it the way I’d audit a real tenant onboarding into a security-first org. A site I’m pointing recruiters at is the worst place to have a CSP-shaped hole or a long-lived admin key hanging around.
Findings.
Read-only pass first. Three HIGHs and a MED that needed fixing now:
- Mgmt admin user with a long-lived key, still active. Last used six weeks earlier. Attached to S3, Route 53, CloudFront, ACM — full reach over every surface the new stack produced.
- CSP and Permissions-Policy missing. The AWS-managed SecurityHeadersPolicy's fields for both are empty objects. HSTS without includeSubDomains or preload.
- Orphaned CloudFront Functions. Published to LIVE, zero behavior associations. IaC said one thing, the live distribution said another.
- Spacelift trust roles with a wildcard ExternalId. StringLike castillo-a@* instead of integration-scoped.
The pivot.
The audit offered two paths for the missing headers: (a) attach the existing CloudFront Functions, or (b) replace the managed Response Headers Policy with a custom one. (a) looked cheap. It wasn’t.
The same cache behavior already runs a Lambda@Edge for pretty-URL rewriting. AWS’s rule: per cache behavior, you get one CFF or one L@E per event — never both. So path (a) really meant either moving the URL rewrite into a CF Function (refactor) or splitting headers off into a second behavior (more surface to maintain).
Path (b), a Response Headers Policy, applies at the response layer, orthogonal to both CFF and L@E. One Terraform resource, aws_cloudfront_response_headers_policy.security_headers, attached to the existing behavior. Done.
After the Spacelift run, the live site returns:
strict-transport-security: max-age=63072000; includeSubDomains; preload
content-security-policy: default-src 'self'; script-src 'self'; ...
permissions-policy: camera=(), microphone=(), geolocation=(), payment=()
x-frame-options: DENY
cross-origin-opener-policy: same-origin
referrer-policy: strict-origin-when-cross-origin
The orphan CF Function stays provisioned with an inline comment explaining why: if a future behavior is added that doesn’t need L@E, the Function is ready. The comment is what keeps the next person from deleting it as drift.
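A response like the one listed above is easy to re-check after every deploy. What follows is a minimal sketch, not the check I actually ran: the function names audit_headers and fetch_headers are made up for illustration, the expected values are copied from the output above, and because the CSP is elided in this post the sketch only checks that it is present and starts with the two directives shown.

```python
import urllib.request

# Expected values copied from the response listed in the post.
EXPECTED = {
    "strict-transport-security": "max-age=63072000; includeSubDomains; preload",
    "permissions-policy": "camera=(), microphone=(), geolocation=(), payment=()",
    "x-frame-options": "DENY",
    "cross-origin-opener-policy": "same-origin",
    "referrer-policy": "strict-origin-when-cross-origin",
}
# The full CSP is elided in the post, so only its leading directives are checked.
CSP_PREFIX = "default-src 'self'; script-src 'self';"


def audit_headers(headers):
    """Return a list of problems; an empty list means every check passed."""
    # HTTP header names are case-insensitive, so normalise before comparing.
    got = {name.lower(): value.strip() for name, value in headers.items()}
    problems = []
    for name, want in EXPECTED.items():
        if name not in got:
            problems.append(f"missing: {name}")
        elif got[name] != want:
            problems.append(f"unexpected value for {name}: {got[name]!r}")
    csp = got.get("content-security-policy")
    if csp is None:
        problems.append("missing: content-security-policy")
    elif not csp.startswith(CSP_PREFIX):
        problems.append("content-security-policy lacks the expected leading directives")
    return problems


def fetch_headers(url):
    """Fetch response headers with a HEAD request (needs network access)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return dict(response.headers.items())
```

Calling audit_headers(fetch_headers(url)) against the site's URL and failing the pipeline on a non-empty result would turn the one-time audit into a regression check. Keeping the comparison in a pure function means it can be exercised without touching the network.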
The point worth writing down isn’t the header policy itself. It’s that the cheap-looking option and the right option were structurally different, and you can only tell which is which once you take the AWS constraint seriously. That is the call worth being able to make.
The other two fixes were less interesting. The admin user got the standard deactivate → delete-key → detach-policies → delete-user treatment after CloudTrail showed exactly one event on its key in 60 days (a GetCallerIdentity probe). The Spacelift trust role got an iam update-assume-role-policy narrowing the ExternalId from StringLike castillo-a@* to StringLike castillo-a@<integrationId>@*, verified by a live Spacelift run completing through the tightened trust.
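To make the ExternalId narrowing concrete, here is a small sketch of what each pattern admits. It approximates IAM's StringLike operator (case-sensitive, * for any run of characters, ? for a single character) with a regular expression. The integration IDs and stack names are invented placeholders, and the shape of the trailing segments is an assumption for illustration only.

```python
import re


def string_like(pattern, value):
    """Approximate IAM StringLike: case-sensitive, * = any run, ? = one char."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


# Patterns from the post; the integration ID below is a hypothetical placeholder.
OLD_PATTERN = "castillo-a@*"
NEW_PATTERN = "castillo-a@01HEXAMPLEINTEGRATION@*"

# Hypothetical ExternalId values presented on AssumeRole.
ours = "castillo-a@01HEXAMPLEINTEGRATION@my-site@write"
other = "castillo-a@01HSOMEOTHERINTEGRATION@other-stack@write"

for label, pattern in (("old", OLD_PATTERN), ("new", NEW_PATTERN)):
    print(label, string_like(pattern, ours), string_like(pattern, other))
```

Under these assumptions the old pattern accepts any integration in the account, while the new one only accepts ExternalIds that carry the one integration ID written into the trust policy.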
What’s tracked as follow-up.
A few things stayed open on purpose:
- Cloudflare API token has no expiry; needs reissue with a scoped, expiring token.
- Terraform Cloud admin has no 2FA.
- Expired SAML signing certs in AWS IIC and the Entra app, kept until the new ones are proven durable.
- Disabled legacy CloudFront distribution + wildcard ACM cert — both vestigial, behind a cascading cleanup task.
- Failover S3 buckets have no access logging; log buckets have no lifecycle policy.
Real ops never finishes. You just decide what to track, and you make sure the next person — or the next agent — has enough context to keep going. The point of the audit wasn’t to close every finding; it was to make sure I knew what was open.