fix: ignore modern static-asset extensions #17

Open
marcin-prerender wants to merge 1 commit into master from fix/ignore-woff2-and-modern-static-assets

@marcin-prerender

The bug

The static-asset filter, implemented as a negative lookahead in the RewriteCond %{REQUEST_URI} condition, is missing modern asset extensions. When a crawler requests a font or a modern image format, the request is forwarded to the Prerender rendering service, which rejects static assets with a 504. The crawler therefore sees a 5xx on an asset that returns 200 for normal users.

Live evidence

Verified against prerender.io's own site: a request for inter-*.woff2 with a Googlebot UA receives 504, while the same request with a normal UA receives 200. Any forwarded font/image request surfaces as a crawler-visible 5xx.

Canonical reference

This propagates the canonical static-asset ignore list from prerender/integration-contract#1, which adds .woff2 .otf .eot .webp .avif .webmanifest.

Changes (.htaccess + the same snippet in README.md)

  • Added \.woff2, \.otf, \.eot, \.webp, \.avif, \.webmanifest to the extension alternation.
  • Deduplicated \.doc, which appeared twice in the list.
  • Added the [NC] flag to the REQUEST_URI condition. The user-agent condition above already uses [NC], but this one did not, so uppercase URLs (e.g. /FONT.WOFF) bypassed the filter and were forwarded to the rendering service.
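
The effect of the changes above can be sanity-checked outside Apache with a small Python model of the condition. This is only a sketch: the extension lists are an abbreviated, illustrative subset of the real alternation in .htaccess, `build_condition` and `is_forwarded` are hypothetical helper names, and it assumes the condition has the shape `^(?!.*?(ext1|ext2|...))`, with [NC] modelled by `re.IGNORECASE`.

```python
import re

# Abbreviated, illustrative subset of the alternation; the real list is longer.
OLD_EXTS = [r"\.js", r"\.css", r"\.png", r"\.doc", r"\.avi", r"\.ttf", r"\.woff", r"\.svg"]
# Extensions added by this PR.
NEW_EXTS = OLD_EXTS + [r"\.woff2", r"\.otf", r"\.eot", r"\.webp", r"\.avif", r"\.webmanifest"]


def build_condition(exts, nocase):
    """Build a negative-lookahead pattern; nocase=True models the [NC] flag."""
    pattern = r"^(?!.*?(" + "|".join(exts) + r"))"
    return re.compile(pattern, re.IGNORECASE if nocase else 0)


def is_forwarded(cond, uri):
    """True when the condition matches, i.e. the URI is NOT filtered as a static asset."""
    return cond.match(uri) is not None


before = build_condition(OLD_EXTS, nocase=False)  # old rule: no [NC], old list
after = build_condition(NEW_EXTS, nocase=True)    # new rule: [NC], extended list

for uri in ["/img/hero.webp", "/FONT.WOFF", "/fonts/inter.woff2", "/pricing"]:
    print(uri, "before:", is_forwarded(before, uri), "after:", is_forwarded(after, uri))
```

In this model, /img/hero.webp and /FONT.WOFF are forwarded under the old condition and filtered under the new one, /fonts/inter.woff2 is filtered in both, and /pricing is forwarded in both.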

Caveats

  • The lookahead does unanchored substring matching, so \.woff already incidentally covered .woff2 and \.avi covered .avif; the new entries make the intent explicit and match the canonical contract list. The same unanchored matching also over-matches (e.g. \.doc matches .docx paths). Anchoring the pattern, and the trade-off that comes with it, is out of scope here.
  • No test suite in this repo; change validated by visual inspection of the rewrite rules.
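
The substring behaviour described in the first caveat can be illustrated with a short Python sketch. It assumes the alternation behaves like an ordinary unanchored regex search. The `anchored` pattern is a hypothetical variant shown only for contrast and is not part of this PR; `matches` is a hypothetical helper.

```python
import re

# Unanchored, as in the current rule: each alternative may match anywhere in the URI.
unanchored = re.compile(r"\.woff|\.avi|\.doc", re.IGNORECASE)

# Hypothetical anchored variant (NOT part of this PR): the extension must end the path.
anchored = re.compile(r"(?:\.woff|\.avi|\.doc)$", re.IGNORECASE)


def matches(pattern, uri):
    """True when the pattern is found somewhere in the URI."""
    return pattern.search(uri) is not None


for uri in ["/fonts/inter.woff2", "/media/clip.avif", "/files/report.docx", "/files/report.doc"]:
    print(uri, "unanchored:", matches(unanchored, uri), "anchored:", matches(anchored, uri))
```

Under these assumptions the unanchored pattern matches all four paths, while the anchored variant matches only /files/report.doc. That is why an anchored list would need every extension spelled out explicitly.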

🤖 Generated with Claude Code

Add .woff2, .otf, .eot, .webp, .avif, and .webmanifest to the
static-asset ignore list in the RewriteCond, per the canonical list in
prerender/integration-contract#1. Also deduplicate the repeated \.doc
entry and add the [NC] flag so uppercase URLs (e.g. /FONT.WOFF) no
longer bypass the filter. Forwarding these asset requests to the
rendering service surfaces crawler-visible 504s.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>