AEM Dispatcher Cache Invalidation for Multiple Cache Farms

Imagine you have an Adobe Experience Manager (AEM) setup hosting multiple websites. This is where AEM really shines, and it is common practice at most companies that host their websites with AEM. Problems arise when each website also has its own content structure in AEM, along with its own requirements for cache settings and so on.

Setting up the web server

Setting up a web server with the Adobe Experience Manager (AEM) dispatcher module for all your websites is pretty simple when all virtual hosts share the same configuration for statfileslevel, publish servers (renders), cache directory (docroot) and invalidation rules.

Here is what an example setup might look like:

An example for a multiple dispatcher webserver setup

And these are the requirements for the above mentioned setup:

Further restrictions for setting up the web server

For the configuration of the flush agent you cannot use the domain name of the load balancer.
Because the load balancer sits in front of your dispatcher web servers, you would never know which web server's cache gets invalidated that way.
You could work with custom headers so that the load balancer can determine which server the request should be sent to, but that may be a long road if you do not have full control of the load balancer yourself.

For these requirements you need to split up the dispatcher configuration into multiple farms. You can use the hostname globbing in the dispatcher module to determine which farm handles a request.

The setup I describe here may be a very special case given the number of restrictions, but I am probably not the only one running into it.

Solution for this setup

Luckily, the invalidation request carries the CQ-Path header (the CRX path of the content that should be flushed), which we can use to determine which website's cache directory should be invalidated.
We configure one flush replication agent for every web server instance, each pointing directly at its dispatcher web server.
So now we know the content path and the website it belongs to.
Changing the Host header of the invalidation request in the web server does the rest: the dispatcher picks the farm that matches the new host name, and the invalidation works properly.

The solution in technical details

The following config can easily be added to the web server configuration, as it is processed before the request hits the dispatcher module.

The LocationMatch block restricts the rules to invalidation requests, so we do not interfere with regular requests that serve the content.
With SetEnvIf we set the environment variable FLUSH_HOST to the domain name of the website, depending on the CQ-Path header.
This can easily be extended to a large number of domains and will work as long as the content path is different for each of the domains.

<LocationMatch "^/dispatcher/invalidate\.cache$">
    # Requires mod_setenvif (SetEnvIf) and mod_headers (RequestHeader).
    # domain A
    SetEnvIf CQ-Path ".*/content-path-of-domain-A/.*" FLUSH_HOST=domain-A
    # domain B
    SetEnvIf CQ-Path ".*/content-path-of-domain-B/.*" FLUSH_HOST=domain-B
    # Rewrite the Host header only if one of the rules above matched.
    RequestHeader set Host "%{FLUSH_HOST}e" env=FLUSH_HOST
</LocationMatch>
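
To check that the mapping behaves as intended, it can help to log every invalidation request for which FLUSH_HOST was set. The following is only a sketch: the log file name and the format string are assumptions chosen for illustration, and the directive has to be placed at server or virtual host level rather than inside the LocationMatch block.

```apache
# Hypothetical log file and format, adjust to your environment.
# %{CQ-Path}i   = CQ-Path request header sent by the flush agent
# %{FLUSH_HOST}e = environment variable set by the SetEnvIf rules
# env=FLUSH_HOST = write a line only when one of the rules matched
CustomLog "logs/flush_host.log" "%t %a CQ-Path=%{CQ-Path}i FLUSH_HOST=%{FLUSH_HOST}e" env=FLUSH_HOST
```

If an activation in AEM produces no line in this log, the content path did not match any of the SetEnvIf patterns and the request kept its original Host header.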

This solution eliminates the need to set up more web server instances than necessary to fulfil the requirements mentioned in this article. The AEM dispatcher setup is also described in detail on the Adobe website.