AEM Dispatcher: The Complete Guide (Basics to Advanced)
A deep, practical guide to the AEM Dispatcher for experienced developers — request flow, filters, cache rules, rewrite rules, virtual hosts, farm files, invalidation and flush agents, security hardening, AEM as a Cloud Service, and a step-by-step cache-troubleshooting workflow, with best practices, do's & don'ts, and a copy-paste cheat sheet.

The Dispatcher is the most misunderstood part of an AEM stack. At a glance it looks like "just an Apache config," so it's easy to treat it as an afterthought — and that's exactly how production incidents happen. In reality the Dispatcher is three critical things at once: your cache, your load balancer, and, most importantly, your security perimeter. Misconfigure the caching side and you cripple publish performance; misconfigure the security side and you expose your repository to the open internet.
This guide is written for developers who already know their way around AEM and want to move from "I can copy a filter rule from Stack Overflow" to "I understand precisely how a request flows through the Dispatcher, why something is or isn't cached, and whether my configuration is safe." It covers both worlds — classic AMS / on-prem and AEM as a Cloud Service (AEMaaCS) — and ends with best practices, common pitfalls, and a cheat sheet you can keep open while you work.
Try it as you read: I built a free AEM Dispatcher Tester & Security Audit that runs entirely in your browser. Paste your filter and cache rules, test any URL, run a batch suite of expected outcomes, or audit your config against the security checklist below.
What the Dispatcher actually is
The Dispatcher is a module — mod_dispatcher — that plugs into the Apache HTTP Server. (On AEMaaCS it runs inside an Adobe-managed container, but it's the same module.) It sits in front of your AEM publish instances, and occasionally in front of author, performing three distinct jobs:
- Caching — it stores rendered pages and assets as files on disk, so the vast majority of requests are served by Apache without ever touching AEM.
- Load balancing — it distributes requests across multiple publish instances (called renders), with failover and session stickiness when needed.
- Security filtering — it decides which URLs are even allowed to reach AEM in the first place.
The mental model worth holding onto is this: AEM renders, and the Dispatcher protects and accelerates. When a request can be answered from the Dispatcher cache, none of your Java code runs at all — and that, in one sentence, is the entire performance story behind a fast AEM site.
What it is not
It's just as useful to be clear about what the Dispatcher doesn't do, because each of these is a common misconception:
- It is not a CDN. On AEMaaCS there is a separate CDN sitting in front of the Dispatcher, but the Dispatcher itself is an origin-side cache, not an edge network.
- It is not a full web application firewall. It filters by URL, method, and extension — it does not inspect request payloads for attacks the way a WAF does.
- It does not understand your content model. It only ever sees HTTP requests and file paths; it has no idea what a "page" or a "component" is.
Where it sits: the request flow
If you remember one thing about the Dispatcher, make it the order in which it processes a request — almost every confusing behavior traces back to this sequence:
Browser
│
▼
[ CDN ] ← AEMaaCS only (Adobe-managed Fastly)
│
▼
[ Apache HTTP Server ]
│ 1. Virtual host match → which site / farm?
│ 2. mod_rewrite → vanity URLs, redirects, normalization
│ 3. mod_dispatcher
│ a. /filter → ALLOW or DENY the request (security)
│ b. /cache → serve from cache, or fetch + store
│ c. /renders → load-balance to a publish instance
▼
[ AEM Publish ] → only reached on a cache miss
Two consequences fall directly out of this ordering, and both have real design implications:
- Filters run before the cache. A request that the filter denies is never cached and never forwarded to publish — security is enforced first, which is exactly what you want.
- Rewrites run before filters. Your mod_rewrite rules can change the URL before the filter ever sees it, so a vanity URL and the filter rule that should allow it must be designed together; otherwise a rewrite can quietly send a request to a path the filter then blocks.
Two worlds: AMS/on-prem vs AEM as a Cloud Service
The Dispatcher's concepts are identical across deployment models, but how you package, deploy, and operate it differs significantly. The first thing to establish on any project is which world you're in:
| Aspect | AMS / On-prem | AEM as a Cloud Service |
|---|---|---|
| Config file(s) | A dispatcher.any tree you own | The Dispatcher SDK project (conf.d + conf.dispatcher.d) |
| Deploy | You manage Apache + module | Built into the release pipeline; validated on deploy |
| Cache invalidation | Flush replication agents + /allowedClients | Mostly TTL + CDN; explicit flush is limited |
| Front cache | Your own CDN (optional) | Adobe-managed CDN always present |
| Validation | Manual / your CI | Mandatory SDK validator — invalid config fails the build |
| Editing defaults | You own everything | Immutable Adobe files you must not edit |
If you're starting a project today, you're almost certainly on AEMaaCS. If you maintain an older estate, it's AMS or on-prem. The rest of this guide explains both and calls out the differences as they come up.
Anatomy of the configuration
Classic AMS / on-prem: dispatcher.any
In the classic model the entire configuration lives in a single file, structured as a tree of /farms. A farm is the unit that ties a set of hostnames to a set of rules, and each farm declares everything about how requests for its sites are handled:
/farms {
  /website {
    /clientheaders {
      "referer"
      "user-agent"
      "authorization"
      "CSRF-Token"
    }
    /virtualhosts {
      "www.example.com"
      "example.com"
    }
    /sessionmanagement {
      /directory "/usr/local/apache/.sessions"
    }
    /renders {
      /rend01 { /hostname "publish1" /port "4503" }
      /rend02 { /hostname "publish2" /port "4503" }
    }
    /filter {
      # security rules (see below)
    }
    /cache {
      # caching rules (see below)
    }
    /statistics { /categories { /html { /glob "*.html" } /others { /glob "*" } } }
    /stickyConnectionsFor "/content/mysite"
    /healthCheck { /url "/health.html" }
    /retryDelay "1"
    /numberOfRetries "5"
    /failover "1"
  }
}
A request is matched to a farm by its /virtualhosts (the incoming host header); once matched, that farm's filter, cache, and render rules take over. Running multiple farms lets you treat different sites — or author versus publish — completely differently within one Dispatcher.
AEM as a Cloud Service: the Dispatcher SDK
On AEMaaCS you never hand-write dispatcher.any. Instead you work in a standardized Dispatcher SDK project that compiles down to the same underlying configuration. The layout separates Apache config (conf.d) from Dispatcher config (conf.dispatcher.d), and — crucially — separates files you own from files Adobe owns:
src/
├── conf.d/ # Apache (httpd) config — MUTABLE
│ ├── available_vhosts/
│ │ └── default.vhost
│ ├── enabled_vhosts/ # symlinks → available_vhosts
│ ├── rewrites/
│ │ ├── default_rewrite.rules # IMMUTABLE (Adobe)
│ │ └── rewrite.rules # ← your vanity/redirect rules
│ └── variables/
│ ├── default.vars # IMMUTABLE
│ └── custom.vars # ← your variables
└── conf.dispatcher.d/ # Dispatcher config — MUTABLE
├── available_farms/
│ └── default.farm
├── enabled_farms/ # symlinks → available_farms
├── cache/
│ ├── default_rules.any # IMMUTABLE
│ ├── rules.any # ← your cache rules
│ └── ignoreUrlParams.any
├── clientheaders/
│ └── clientheaders.any
├── filters/
│ ├── default_filters.any # IMMUTABLE — the baseline deny/allow
│ └── filters.any # ← your filter customizations
└── virtualhosts/
└── virtualhosts.any
Three rules define how you work in this layout:
- Never edit the default_* immutable files. Adobe ships and maintains them; your files $include them, and you customize only the non-default ones. The validator will reject any edit to an immutable file.
- Farms and vhosts are enabled via symlinks: you place a file in available_farms/ and symlink it from enabled_farms/ to turn it on.
- The whole configuration is validated on every deploy. A broken config fails the pipeline, which means it fails before it reaches production rather than after.
Farm files
A farm is the central organizing unit of the Dispatcher — it binds a set of hostnames to a complete set of rules. Everything else in this guide (filters, cache, renders) lives inside a farm, so understanding how farm files are structured and selected is foundational.
On classic AMS/on-prem, all farms live in one dispatcher.any under a /farms block. On AEM as a Cloud Service, each farm is its own file: you place a .farm file in available_farms/ and turn it on by symlinking it from enabled_farms/ — the same available_* / enabled_* pattern used for virtual hosts. A file that isn't symlinked simply doesn't run, which makes enabling and disabling a farm a one-line change.
A single farm contains all of these building blocks:
| Section | Role |
|---|---|
| /virtualhosts | The hostnames this farm answers for (how a request picks a farm) |
| /clientheaders | The allow-list of request headers forwarded to publish |
| /renders | The publish instances this farm load-balances across |
| /filter | Security: which URLs are allowed through |
| /cache | What is cached, and how |
| /statistics | Categories used to weight load balancing |
| /stickyConnectionsFor | Paths that should stick to one render |
| /healthCheck, /retryDelay, /numberOfRetries, /failover | Resilience / load-balancing tuning |
Why run more than one farm? Because the rules often need to differ by site or audience. A multi-brand setup gives each brand its own farm (different vhosts, different cache rules); an environment that serves both author and publish through the same Dispatcher uses separate farms so author traffic is filtered and cached completely differently from public traffic.
The detail that ties farms to requests is farm selection: when a request arrives, the Dispatcher matches the incoming host header against each farm's /virtualhosts, and the first farm that matches handles the request. If two farms could match the same host, order matters — which is why farm file names are often prefixed with numbers to control evaluation order. (Virtual hosts get their own section next.)
Tip: On AEMaaCS, if a farm's rules don't seem to apply, first confirm the .farm file is actually symlinked from enabled_farms/. An unlinked farm is silently inactive, and the validator won't flag it as an error.
The filter section: your security perimeter
This is the part of the Dispatcher that keeps your company out of the security headlines, so it deserves your full attention. The Dispatcher evaluates filter rules top to bottom, and the last matching rule wins. If no rule matches a request at all, the Dispatcher denies it by default. That default-deny behavior is your friend — it means a forgotten URL fails closed, not open.
The pattern every secure filter follows is deny everything first, then allow narrowly, and finally re-deny the dangerous edge cases:
/filter {
  # 1) Deny everything first.
  /0001 { /type "deny" /url "*" }

  # 2) Allow only what the site needs.
  /0011 { /type "allow" /method "GET" /url "/content/*" }
  /0012 { /type "allow" /method "GET" /url "/content/dam/*" }
  /0013 { /type "allow" /method "GET" /url "/etc.clientlibs/*" }
  /0021 { /type "allow" /method "POST" /url "/content/*" /extension "form" }

  # 3) Re-deny dangerous selectors/extensions on otherwise-allowed paths.
  /0061 { /type "deny" /url "*" /selectors "feed" /extension "xml" }
  /0062 { /type "deny" /url "*" /selectors "*" /extension "json" }
  /0063 { /type "deny" /url "*" /selectors "*" /extension "*" /suffix "/*" }

  # 4) Block traversal and internal nodes.
  /0065 { /type "deny" /url "*..*" }
  /0066 { /type "deny" /url "*/_jcr_content*" }
}
Notice the structure: a single deny-all, a short list of specific allows, and then a set of denies that close the holes those broad allows would otherwise leave open. Because the last match wins, the order is everything.
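To make the "last match wins" behaviour concrete, here is a minimal Python sketch of the evaluation loop. It is a simulation under stated assumptions, not the Dispatcher's own code: it only understands /type, /method and /url, it uses Python's fnmatch as a stand-in for Dispatcher globs (the two are not guaranteed to agree on every pattern), and the default-deny fallback simply follows the description above. The evaluate function and the rule dictionaries are hypothetical names for illustration.

```python
from fnmatch import fnmatchcase


def evaluate(rules, method, url):
    """Return "allow" or "deny" for a request; the LAST matching rule decides."""
    decision = "deny"  # assumed fallback when no rule matches at all
    for rule in rules:
        # A rule without a method applies to every method.
        if "method" in rule and rule["method"] != method:
            continue
        if fnmatchcase(url, rule.get("url", "*")):
            decision = rule["type"]  # keep going: later matches override
    return decision


# Simplified version of the skeleton above (url + method only).
RULES = [
    {"type": "deny", "url": "*"},
    {"type": "allow", "method": "GET", "url": "/content/*"},
    {"type": "deny", "url": "*..*"},
    {"type": "deny", "url": "*/_jcr_content*"},
]

# Same rules, but the broad allow moved to the end: it re-opens what was closed.
MISORDERED = [RULES[0], RULES[2], RULES[3], RULES[1]]

if __name__ == "__main__":
    for method, url in [
        ("GET", "/content/site/en.html"),
        ("POST", "/content/site/en.html"),
        ("GET", "/crx/de/index.jsp"),
        ("GET", "/content/dam/x.jpg/_jcr_content/renditions"),
    ]:
        print(method, url, "->", evaluate(RULES, method, url))
```

Running the same probes against MISORDERED shows the internal-node request being allowed again, which is exactly the ordering mistake the deny-allow-deny structure is meant to prevent.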
Filter properties you should know
Each rule is built from a small set of properties. Knowing exactly what each one matches against is what lets you reason about whether a rule does what you think:
| Property | Meaning |
|---|---|
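| /type | allow or deny. |
| /url | Matches the request path (modern, preferred). |
| /glob | Matches the full request line (GET /content/* *), classic AMS style. The method is part of the glob. Adobe's documentation marks glob-based filtering as deprecated, so prefer the decomposed properties in new rules. |
| /method | HTTP method (GET, POST, …). |
| /extension | File extension (html, json, css). |
| /selectors | Sling selectors between the resource and extension (a.b in page.a.b.html). |
| /suffix | Sling suffix after the extension (/x/y in page.html/x/y). |
| /path | Matches only the resource path, the part before selectors and extension; typically combined with /selectors, /extension, and /suffix. |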
Patterns use glob syntax, where * matches any sequence of characters and ? matches a single character. A request-line glob such as "GET /content/* *" means method GET, a path under /content, any protocol — so the method is enforced as part of the same rule.
Why selectors and suffixes matter (DoS + leakage)
Sling's URL flexibility is a genuine security concern, and it's worth understanding why the filter goes to such lengths to constrain selectors. The same resource can be requested in countless ways — page.html, page.foo.html, page.1.2.3.html — and each unique URL becomes a separate file in the Dispatcher cache. An attacker who hammers page.<random>.html with endless distinct selectors can fill your disk and pollute the cache with junk: a cheap and effective denial-of-service. Worse, certain selectors like .infinity.json or .tidy.json cause Sling to serialize the repository itself into JSON. For both reasons, a hardened filter always constrains selectors and explicitly denies JSON traversal.
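The decomposition the filter properties rely on can be sketched in a few lines of Python. This is a simplified illustration, assuming the split happens at the first dot in the path (a module that cannot see the repository has no better information); real Sling resolution consults the repository and can differ. The decompose function is a hypothetical helper, not part of any AEM API.

```python
def decompose(url_path):
    """Split a URL path into resource path, selectors, extension and suffix.

    Simplification: everything before the first "." is the resource path,
    the dotted segment up to the next "/" holds selectors plus extension,
    and whatever follows is the suffix.
    """
    first_dot = url_path.find(".")
    if first_dot == -1:
        return {"path": url_path, "selectors": "", "extension": "", "suffix": ""}

    path = url_path[:first_dot]
    rest = url_path[first_dot + 1:]

    slash = rest.find("/")
    if slash == -1:
        dotted, suffix = rest, ""
    else:
        dotted, suffix = rest[:slash], rest[slash:]

    # The last dotted token is the extension; everything before it is selectors.
    *selectors, extension = dotted.split(".")
    return {
        "path": path,
        "selectors": ".".join(selectors),
        "extension": extension,
        "suffix": suffix,
    }


if __name__ == "__main__":
    for example in ("/content/page.html",
                    "/content/page.a.b.html/x/y",
                    "/content/page.infinity.json"):
        print(example, "->", decompose(example))
```

Seen this way, page.1.2.3.html and page.foo.html are the same resource path with different selectors, which is why an unconstrained /selectors value lets one page produce an unbounded number of distinct cache files.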
Caching deep dive
With security covered, the other half of the Dispatcher's job is caching. Here is a representative /cache block, which we'll then unpack setting by setting:
/cache {
  /docroot "/mnt/var/www/html"
  /statfileslevel "2"
  /gracePeriod "2"
  /serveStaleOnError "1"
  /allowAuthorized "0"
  /enableTTL "1"
  /rules {
    /0000 { /glob "*" /type "deny" }
    /0001 { /glob "*.html" /type "allow" }
    /0002 { /glob "*.css" /type "allow" }
    /0003 { /glob "*.js" /type "allow" }
  }
  /ignoreUrlParams {
    /0001 { /glob "*" /type "deny" }
    /0002 { /glob "utm_*" /type "allow" }
    /0003 { /glob "gclid" /type "allow" }
  }
  /allowedClients {
    /0001 { /glob "*" /type "deny" }
    /0002 { /glob "127.0.0.1" /type "allow" }
  }
  /headers {
    "Cache-Control"
    "Content-Type"
    "Content-Disposition"
    "Expires"
  }
}
How file caching works
The Dispatcher cache is refreshingly simple under the hood. When a cacheable response comes back from publish, the Dispatcher writes it as a real file under /docroot, mirroring the URL's path on disk. The next request for that same URL is served by Apache straight from the file system — AEM is never involved. Invalidation works through .stat files: when content changes, the Dispatcher "touches" a .stat file, and any cache file older than that timestamp is treated as stale and re-fetched on the next request.
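As a rough illustration of how .stat touching interacts with the /statfileslevel "2" value in the block above, the sketch below lists which .stat files an activation would touch. It assumes the behaviour described in Adobe's Dispatcher documentation as I understand it: the docroot counts as level 0, and .stat files are touched from the docroot down to the changed file's folder or the configured level, whichever is shallower. The function name and the example handle are hypothetical.

```python
from pathlib import PurePosixPath


def touched_stat_files(docroot, handle, statfileslevel):
    """List the .stat files touched when `handle` is invalidated.

    Assumption: level 0 is the docroot itself, and touching stops at
    min(statfileslevel, depth of the folder containing the changed file).
    """
    # Folders that contain the changed resource, e.g. ("content", "site", "en").
    parent_dirs = PurePosixPath(handle).parts[1:-1]
    depth = min(statfileslevel, len(parent_dirs))
    root = PurePosixPath(docroot)
    return [str(root.joinpath(*parent_dirs[:level], ".stat"))
            for level in range(depth + 1)]


if __name__ == "__main__":
    for level in (0, 2):
        print(level, touched_stat_files("/mnt/var/www/html",
                                        "/content/site/en/page", level))
```

With level 0 only the docroot's .stat file exists, so every cached file is compared against it; with level 2 a sibling tree such as /content/othersite keeps its own untouched .stat file and stays valid.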
The settings that bite people
Most Dispatcher caching incidents come down to a handful of settings that are easy to get subtly wrong. These are the ones worth understanding deeply rather than copying blindly:
- /statfileslevel controls how deep below the docroot the Dispatcher maintains .stat files, and therefore how granular invalidation is. The docroot is level 0; an activation touches the .stat files from the docroot down to the changed path's level or the configured level, whichever is smaller. A value of 0 keeps a single .stat file at the docroot, so every activation invalidates the entire site, producing slow, thundering-herd re-caching. Set it to roughly match your content depth (commonly 2–4) so that a change under /content/site/en/... invalidates only that branch.
- /gracePeriod is the number of seconds a stale file may still be served while it is being invalidated. It smooths out invalidation storms by letting visitors keep getting a slightly old page rather than all hitting publish at once.
- /serveStaleOnError tells the Dispatcher to serve the last known-good cache file if publish is unavailable. It's a cheap, high-value resilience setting.
- /allowAuthorized "0" prevents the Dispatcher from caching requests that carry an authentication header or cookie. Leave it at 0 unless you genuinely understand the consequences: caching authorized responses risks serving one user's personalized page to another.
- /ignoreUrlParams lists the query-string parameters the Dispatcher ignores when deciding whether a request can be cached and served from cache. In this block, allow means "ignore this parameter". This one is critical: a request carrying any parameter that is not ignored is not cached and goes to publish, so without a deliberate list every tracking link bypasses the cache, which opens the door to cache-busting DoS. Ignoring a parameter that does change the output has the opposite risk, cache poisoning. The safe pattern is to deny * and then allow only parameters that never change the rendered response, such as tracking parameters.
- /headers declares which response headers are stored alongside the cache file and replayed on a cache hit.
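The effect of the /ignoreUrlParams block shown earlier can be sketched as follows. This is a simulation under assumptions: fnmatch stands in for Dispatcher globs, the last matching rule is taken to win as in other rule blocks, and, as I understand Adobe's documentation, a request is only cacheable when every parameter it carries is ignored. The function names are hypothetical.

```python
from fnmatch import fnmatchcase
from urllib.parse import parse_qsl

# Mirrors the sample block: deny everything, then ignore tracking parameters.
IGNORE_RULES = [("*", "deny"), ("utm_*", "allow"), ("gclid", "allow")]


def is_ignored(param, rules):
    """True when the last rule matching `param` is an allow (= ignore it)."""
    verdict = "deny"
    for glob, rule_type in rules:
        if fnmatchcase(param, glob):
            verdict = rule_type
    return verdict == "allow"


def query_blocks_caching(query, rules):
    """True when at least one parameter in the query string is not ignored."""
    names = [name for name, _ in parse_qsl(query, keep_blank_values=True)]
    return any(not is_ignored(name, rules) for name in names)


if __name__ == "__main__":
    for query in ("", "utm_source=mail&gclid=abc", "utm_source=mail&page=2"):
        print(repr(query), "blocks caching:", query_blocks_caching(query, IGNORE_RULES))
```

A request with only utm_source and gclid is treated like the bare URL, while adding page=2 sends it to publish, which is the behaviour you want if page really changes the output.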
TTL caching and AEMaaCS
On AEMaaCS, caching becomes a two-layer game: the Dispatcher cache and the Adobe CDN sitting in front of it. You control both primarily through the response headers your code and pages set, rather than through flush configuration:
- Cache-Control: max-age=..., s-maxage=... Here s-maxage drives the shared (CDN and Dispatcher) TTL, while max-age drives the browser's own cache.
- Surrogate-Control: a CDN-specific TTL that the browser never sees.
- Age: how long the cached object has been alive; one of your best debugging signals.
Set /enableTTL "1" so the Dispatcher honors the response's Cache-Control for its own TTL. The practical mindset shift on AEMaaCS is to think in TTLs and headers, not in flush agents.
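To reason about which TTL a shared cache would pick from a Cache-Control value, a small parser helps. The sketch below follows general HTTP caching semantics (s-maxage takes precedence over max-age for shared caches) and treats private, no-store and no-cache as "do not serve from a shared cache", which is a deliberate simplification. It is not a statement of how the Dispatcher or the Adobe CDN implement TTL handling internally; both function names are hypothetical.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into {directive: argument-or-None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def shared_ttl(value):
    """Seconds a shared cache may keep the response, or None if unspecified."""
    directives = parse_cache_control(value)
    # Simplification: treat all three as "no shared caching".
    if any(flag in directives for flag in ("private", "no-store", "no-cache")):
        return 0
    # Shared caches prefer s-maxage and fall back to max-age.
    for key in ("s-maxage", "max-age"):
        arg = directives.get(key)
        if arg is not None and arg.isdigit():
            return int(arg)
    return None


if __name__ == "__main__":
    for header in ("max-age=300, s-maxage=3600", "max-age=300", "private, max-age=300"):
        print(header, "->", shared_ttl(header))
```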
Cache invalidation
On-prem / AMS: flush agents + /allowedClients
In the classic model, invalidation is push-based. Each publish instance runs a Dispatcher Flush replication agent that, on activation, POSTs an invalidation request to the Dispatcher:
POST /dispatcher/invalidate.cache
CQ-Action: Activate
CQ-Handle: /content/site/en/page
Content-Type: application/octet-stream
The /allowedClients block controls who is permitted to send these invalidation requests, and it is security-critical. An open /allowedClients — one that contains an allow "*" — lets anyone on the network flush your cache at will, which is a trivial denial-of-service. The correct pattern, exactly as with filters, is to deny * first and then allow only your known flush agents.
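For testing an /allowedClients setup, it can be handy to build the same invalidation request by hand. The sketch below only constructs the request object with Python's standard library and does not send it; the host name dispatcher.example.internal and the helper name are placeholders, and the headers are the ones shown in the request above.

```python
from urllib.request import Request


def build_flush_request(dispatcher_base_url, handle, action="Activate"):
    """Build (but do not send) a Dispatcher invalidation request."""
    return Request(
        url=dispatcher_base_url.rstrip("/") + "/dispatcher/invalidate.cache",
        data=b"",  # empty body; the handle travels in the headers
        method="POST",
        headers={
            "CQ-Action": action,
            "CQ-Handle": handle,
            "Content-Type": "application/octet-stream",
        },
    )


# Placeholder host: replace with a Dispatcher you are allowed to flush.
req = build_flush_request("http://dispatcher.example.internal", "/content/site/en/page")

if __name__ == "__main__":
    print(req.get_method(), req.full_url)
    # urllib stores header names capitalized ("Cq-handle"); HTTP header
    # names are case-insensitive, so this does not change the request.
    print(req.header_items())
    # To actually send it: urllib.request.urlopen(req, timeout=5)
```

Sending this from a machine that is not listed in /allowedClients should be rejected; if it succeeds, the block is too permissive.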
AEMaaCS: mostly TTL + CDN
On AEMaaCS the model flips. You generally rely on TTLs and the CDN rather than fine-grained flush agents — there is auto-invalidation on publish, but you don't manage /allowedClients flush IPs by hand the way you did on-prem. The practical implication is to design content to tolerate short TTLs and to use the CDN's purge mechanisms where they're available.
Client headers and response headers
Two header-related settings are easy to overlook and both carry security weight:
- /clientheaders is the allow-list of request headers that the Dispatcher forwards on to publish. A wildcard "*" forwards everything, including a spoofable Host header, which opens the door to host-header injection and cache poisoning. Forward only what you actually need: referer, user-agent, a CSRF token, and authentication headers where appropriate.
- /cache /headers is the list of response headers cached and replayed on a hit. The one to watch here is Set-Cookie: caching it on a public page means a personalized response leaks into the shared cache.
Rewrite rules
Rewrite rules are Apache mod_rewrite directives — not part of the Dispatcher module itself — and they handle vanity URLs, redirects, host canonicalization, and HTTPS enforcement. On AEMaaCS they live in conf.d/rewrites/rewrite.rules. As the request-flow diagram showed, they run before the Dispatcher filter, so they shape the URL the rest of the pipeline sees.
# Canonical host + HTTPS
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^/?(.*)$ https://www.example.com/$1 [R=301,L]
# Vanity URL → real content path
RewriteRule ^/careers$ /content/site/en/careers.html [PT,L]
The flags carry the meaning: [R=301,L] issues a permanent redirect and stops processing, while [PT,L] ("pass through") rewrites the path internally — the browser URL stays /careers but AEM renders the real content path. Because rewrites run first, the filter sees the rewritten path: a vanity rule that points at a path your filter denies will still be blocked, so keep rewrite targets and filter allows consistent with each other.
Note: Prefer doing redirects and URL shortening in mod_rewrite at the Dispatcher rather than inside AEM. It's faster (no publish round-trip), it's cacheable, and it keeps URL logic in one place. AEM's own vanity URLs and sling:alias are fine for small cases, but heavy redirect maps belong in rewrite rules.
Virtual hosts
Virtual hosts are how the web tier decides which site — and which farm — a request belongs to, based on the incoming host header. There are actually two related layers, and keeping them distinct avoids a lot of confusion.
The Apache virtual host (a .vhost file in conf.d/available_vhosts/, symlinked from enabled_vhosts/) is standard httpd configuration: it declares the ServerName and ServerAlias hostnames Apache answers for, the port, and TLS settings, and it includes the rewrite and dispatcher configuration.
<VirtualHost *:80>
ServerName www.example.com
ServerAlias example.com
# include rewrites, then hand off to the dispatcher
</VirtualHost>
The Dispatcher /virtualhosts block (inside a farm, e.g. virtualhosts.any) is what the Dispatcher module matches a request against to pick a farm:
/virtualhosts {
"www.example.com"
"example.com"
"*.example.com"
}
So a request flows host-header → Apache virtual host (does Apache serve this hostname?) → Dispatcher /virtualhosts (which farm handles it?). A few practical points: list every hostname a site is reached by, including bare and www variants and any wildcards; keep a sensible default/fallback vhost so unmatched hosts get a controlled response rather than a random farm; and remember that a missing or mis-ordered vhost entry is a classic cause of "the wrong site is being served" or "my farm's rules aren't applying."
Important: Virtual host matching is also a security control. Forwarding an attacker-controlled Host header through to publish enables host-header injection and cache poisoning, which is why you constrain /clientheaders (above) and define explicit virtual hosts rather than accepting any host.
The security checklist (memorize this)
This is the canonical hardening checklist for an AEM Dispatcher. Every item below represents a class of attack; run each one against your filter and confirm that none of them is reachable:
- Admin consoles: /crx/*, /system/console/*, and /bin/* (except specifically whitelisted servlets). These expose the repository and runtime internals.
- Repository dumps: *.infinity.json, *.tidy.json, *.-1.json, *.children.json, and any *.json reached with selectors. These serialize content as JSON.
- QueryBuilder over GET: /bin/querybuilder.json allows arbitrary repository queries over a GET request, enabling mass data exfiltration.
- Source exposure: /apps/* and /libs/* (allow only specific client libraries). These leak your application and product code.
- Traversal and internals: .., /_jcr_content, and /jcr:content, which reach internal nodes and denied trees.
- Write surface: arbitrary POST to /content/*, which can trigger unintended writes.
- Selector DoS: unbounded selectors on cacheable extensions, the cache-flooding attack described earlier.
- Feeds and listings: *.feed.xml and *.rss, which can expose listing data.
- Host-header injection: don't forward the Host header blindly via /clientheaders.
Automate it: rather than checking these by hand, paste your filter into the Dispatcher Tester and open the Security audit tab — it runs every probe above against your rules and grades the result.
AEMaaCS specifics every developer hits
A few cloud-specific realities trip up developers coming from 6.5, and they're worth stating plainly:
- Immutable vs mutable files. You customize filters.any, rules.any, rewrite.rules, virtualhosts.any, and clientheaders.any, and you never touch the default_* files, which your files $include. Editing an immutable file fails the build.
- Validate locally before you push. The SDK ships a validator and a Docker runner so you can reproduce the pipeline's checks on your own machine:
# Validate the whole config (fails like the pipeline would)
$ ./bin/validator full -d ./out ./src
# Run the Dispatcher in Docker against a local publish/SDK
$ ./bin/docker_run.sh ./src host.docker.internal:4503 8080
- Deploys fail on invalid config. Treat this as a feature, not a hurdle — a broken Dispatcher configuration never reaches production.
- The cache is two layers deep. Because the CDN sits in front of the Dispatcher, a stale page you're seeing might be cached at the CDN, not the Dispatcher — a distinction that matters a lot when debugging.
Cache debugging & troubleshooting
Cache issues are the most common Dispatcher problems you'll be asked to solve, so it's worth having a repeatable method instead of guessing. Almost all of it comes down to reading response headers and checking whether a file exists in the docroot. Start with these signals:
| Signal | What it tells you |
|---|---|
| Age | Seconds the object has been cached. 0 or missing → likely a miss/uncached. |
| Cache-Control / Surrogate-Control | The TTL the origin requested. |
| X-Cache: HIT/MISS | CDN hit/miss (AEMaaCS). |
| Set-Cookie on a public page | A caching red flag: personalized response leaking into cache. |
| File present under /docroot? | On-prem: definitive proof a URL was cached. |
| Dispatcher logs (loglevel ↑) | Why a request was denied or not cached. |
The first 60 seconds
Whatever the symptom, begin the same way. Fetch the URL with curl -I and read the headers, then localize the layer before changing anything — because the fix is completely different depending on where the bad copy lives:
Browser  →  CDN       →  Dispatcher  →  Publish
   │         │              │             │
  disk     X-Cache        docroot       Age=0 / fresh render
  cache    HIT/MISS       file?
Walk it outside-in: is it the browser (hard-refresh / incognito to rule out), the CDN (X-Cache, Age), the Dispatcher (is there a stale file in /docroot?), or publish itself (does publish render the right thing directly)? Identify the layer first; fix second.
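The header-reading step can be turned into a small triage helper. The sketch below takes response headers as a dictionary (for example copied from curl -I output) and returns plain-language findings based on the signals in the table above. The function name and the wording of the findings are illustrative choices, and the X-Cache check is a simple substring test that will not capture every CDN's multi-layer values.

```python
def diagnose(headers):
    """Turn response headers (as from `curl -I`) into plain-language findings."""
    h = {name.lower(): value.strip() for name, value in headers.items()}
    findings = []

    age = h.get("age")
    if age is None or age == "0":
        findings.append("Age is 0 or missing: likely a miss or an uncached response")
    else:
        findings.append(f"Age is {age}s: the object came from a cache")

    x_cache = h.get("x-cache", "").upper()
    if "HIT" in x_cache:
        findings.append("X-Cache reports HIT: the CDN served this copy")
    elif "MISS" in x_cache:
        findings.append("X-Cache reports MISS: the CDN went to origin")

    if "set-cookie" in h:
        findings.append("Set-Cookie present: red flag on a cacheable public page")

    if "cache-control" not in h and "surrogate-control" not in h:
        findings.append("No Cache-Control or Surrogate-Control: origin requested no TTL")

    return findings


if __name__ == "__main__":
    sample = {"Age": "120", "X-Cache": "HIT", "Cache-Control": "max-age=300"}
    for line in diagnose(sample):
        print("-", line)
```

The output narrows down the layer to look at next; it does not replace checking the docroot or hitting publish directly.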
Problem 1 — "My content change isn't showing up"
The most frequent complaint, and almost always a stale cache. Diagnose in order:
- Confirm publish is correct. Hit the publish instance directly (bypassing the Dispatcher/CDN). If publish is wrong, it's not a cache problem, it's replication; re-activate the page.
- Check the Dispatcher. Look for a stale file in /docroot for that path. If it's there, the flush didn't happen: verify the Dispatcher Flush agent fired on activation and that /statfileslevel invalidates the right branch.
- Check the CDN. A non-zero Age and X-Cache: HIT mean the CDN is serving an old copy; it will expire by TTL or needs a purge.
- Check the browser. Rule it out with a hard refresh or incognito.
Problem 2 — "The page is slow / never caches"
Here the page renders correctly but every request hits publish (Age stays 0, no docroot file). Common causes:
- A cache rule denies it. Confirm the path/extension is allowed in /cache /rules.
- A Set-Cookie on the response. The Dispatcher won't cache a response that sets a cookie, so find and remove the stray cookie on the public page.
- Cache-Control: no-cache/private coming from publish, or max-age=0 with /enableTTL "1".
- Query-string parameters. Tracking params not covered by /ignoreUrlParams make the request uncacheable, so it goes to publish every time. Add an allow-list.
- /allowAuthorized "0" plus an auth header/cookie: authorized requests are intentionally not cached.
Problem 3 — "Users see the wrong / someone else's content"
The most serious class, because it's a data-leak. Usually a personalized response got cached and served to everyone:
- Look for Set-Cookie on a cached public page, the smoking gun.
- Confirm /allowAuthorized "0" so authenticated responses are never cached.
- Check that /clientheaders isn't forwarding a spoofable Host (host-header poisoning).
- Make sure personalized fragments are excluded from caching (assembled at the edge or on the client), not baked into a cached page.
Tip: Reproduce filter and cache decisions without a running Dispatcher by pasting your rules into the Dispatcher Tester — the single-URL trace shows exactly which rule allowed, denied, or cached a path, which is often faster than reading logs.
Performance best practices
Performance on AEM is overwhelmingly a function of how well the Dispatcher caches, so these habits matter:
- Cache as much as you can. The goal is a high cache-hit ratio, because every miss runs Java on publish.
- Keep statfileslevel sane so each activation invalidates a branch rather than the whole site.
- Normalize URLs (trailing slashes, and query parameters via ignoreUrlParams) so you don't fragment the cache into near-duplicate entries.
- Cache client libraries aggressively with long TTLs; they're versioned by content hash, so a long TTL is safe.
- Enable serveStaleOnError for resilience when publish has a bad moment.
- Never cache personalized or authorized responses. Split personalized fragments out and assemble them at the edge (SSI) or on the client instead.
Do's and Don'ts
Do
- ✅ Start every filter with deny *, then allow only specific paths.
- ✅ Explicitly deny .json selectors, .infinity, QueryBuilder, and admin consoles.
- ✅ Maintain an ignoreUrlParams allow-list for query parameters.
- ✅ Restrict /allowedClients to your known flush agents (on-prem).
- ✅ Validate the configuration in CI (AEMaaCS does this for you automatically).
- ✅ Keep a test suite of critical allowed and blocked URLs, and re-run it on every change.
Don't
- ❌ Don't run a default-permissive filter "just for now."
- ❌ Don't forward all client headers with /clientheaders "*".
- ❌ Don't set /allowAuthorized "1" without fully understanding the consequences.
- ❌ Don't edit the AEMaaCS immutable default_* files.
- ❌ Don't cache responses that carry Set-Cookie.
- ❌ Don't verify security by eye; probe it.
Common pitfalls
A few failure modes show up again and again. Recognizing them saves hours:
- "Last match wins" surprises. A broad allow placed after a specific deny silently re-opens what you closed. When a rule seems ignored, trace the full evaluation order.
- Selectors slipping through. page.model.json stays reachable if you denied *.json but forgot to constrain /selectors.
- Cache misses caused by tracking parameters like utm_* and gclid that you forgot to add to ignoreUrlParams.
- A vanity rewrite pointing at a denied path, so the page 404s through the filter even though publish would render it.
- statfileslevel 0 causing full-site invalidation storms on every activation.
- Editing an immutable file on AEMaaCS and failing the deploy with a validator error.
Cheat sheet
Minimal secure filter skeleton
/filter {
/0001 { /type "deny" /url "*" }
/0011 { /type "allow" /method "GET" /url "/content/*" }
/0012 { /type "allow" /method "GET" /url "/etc.clientlibs/*" }
/0061 { /type "deny" /url "*" /selectors "*" /extension "json" }
/0065 { /type "deny" /url "*..*" }
/0066 { /type "deny" /url "*/_jcr_content*" }
}
Cache hardening defaults
/statfileslevel "2"
/gracePeriod "2"
/serveStaleOnError "1"
/allowAuthorized "0"
/ignoreUrlParams { /0001 { /glob "*" /type "deny" } /0002 { /glob "utm_*" /type "allow" } }
Headers to check when debugging
Age: how long cached (0/absent → miss)
Cache-Control: origin TTL (max-age / s-maxage)
Surrogate-Control: CDN-only TTL
X-Cache: HIT/MISS at the CDN (AEMaaCS)
Set-Cookie: must NOT appear on cached public pages
AEMaaCS validator commands
./bin/validator full -d ./out ./src # validate config
./bin/docker_run.sh ./src host.docker.internal:4503 8080 # run locally
Security probes that must all be denied
/crx/de/index.jsp
/system/console/bundles
/bin/querybuilder.json
/content.infinity.json
/content.tidy.-1.json
/content.feed.xml
/etc.json
/apps.json
/content/dam/x.jpg/_jcr_content/renditions
POST /content/site/page
Wrapping up
For all its reputation, the Dispatcher rewards a simple discipline: deny by default, allow narrowly, cache aggressively, and verify with tests. Internalize the request flow (rewrite → filter → cache → render), the last-match-wins rule, and the security checklist, and you'll confidently handle the vast majority of real-world Dispatcher work — on both AMS and AEM as a Cloud Service.
When you're ready to validate your own configuration, run it through the AEM Dispatcher Tester & Security Audit — a single-URL trace, batch test suites, and an automated security audit, all in the browser.