Accept-Encoding, It’s Vary important.
One of the best things about running BootstrapCDN is the new things I've learned about web performance. Today, I'd like to share a few insights gained while resolving an issue originally brought up by a peer. The request was to include the Vary: Accept-Encoding header -- it's not on every server response, but it should be. Here's why.
When browsers make a request, they include HTTP headers for the server to decide what to send back (Is this a mobile client? Can it handle compressed content? Does it need a certain language?).
That's great for direct access, but modern networks use intermediate caches and CDNs. And there's the problem: how does the cache use headers to decide what to send back? How can it replicate the server's decision-making logic?
Vary to the rescue. The Vary header describes what information "uniquely" identifies a request -- caches should only be used if the incoming request matches the Vary information in the cache.
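To make that concrete, here is a minimal sketch of how a cache might fold Vary into its lookup key. It is a toy model, not how any particular CDN or proxy is implemented; the names ToyCache, cache_key and normalize are made up for illustration, and real caches also handle expiry, Vary: * and header normalization that this sketch ignores.

```python
# Toy model of a shared cache that honours the Vary response header.
# All names here are hypothetical; this is not any real CDN's logic.

def normalize(headers):
    """Lower-case header names so lookups are case-insensitive."""
    return {name.lower(): value.strip() for name, value in headers.items()}


def cache_key(url, request_headers, vary):
    """Build a key from the URL plus every request header named in Vary."""
    req = normalize(request_headers)
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    return (url,) + tuple((n, req.get(n, "")) for n in names)


class ToyCache:
    def __init__(self):
        self.vary_by_url = {}  # URL -> Vary value last seen from the origin
        self.entries = {}      # cache key -> stored response body

    def store(self, url, request_headers, response_headers, body):
        """Remember the response under a key shaped by its Vary header."""
        vary = normalize(response_headers).get("vary", "")
        self.vary_by_url[url] = vary
        self.entries[cache_key(url, request_headers, vary)] = body

    def lookup(self, url, request_headers):
        """Return the stored body, or None if this request does not match."""
        if url not in self.vary_by_url:
            return None
        key = cache_key(url, request_headers, self.vary_by_url[url])
        return self.entries.get(key)


if __name__ == "__main__":
    cache = ToyCache()
    cache.store("/index.html", {"Accept-Encoding": "gzip"},
                {"Vary": "Accept-Encoding"}, b"<gzipped bytes>")
    # Same Accept-Encoding: served from cache.
    print(cache.lookup("/index.html", {"Accept-Encoding": "gzip"}))
    # No Accept-Encoding: treated as a different entry, so a miss.
    print(cache.lookup("/index.html", {}))
```

If the origin sends no Vary header at all, the key collapses to just the URL, so every client gets whichever variant was stored first.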
For example, if a server sends the Vary: User-Agent header, intermediate caches will store a separate cache entry for each User-Agent they see (every OS + browser combination, yikes). This behavior was an issue for me in support (we're hiring!), because we saw origin servers getting hammered as each user-agent requested new content and sidestepped the cache. After some research, I figured out why this happened and how to fix it (turn off Vary: User-Agent), but the header left a bad taste in my mouth.
Well, the BootstrapCDN issue came up, so I decided to give Vary another look. I went to one of my favorite WebPerf education sites and found this article that explains gzip & Accept-Encoding. After following the communication graphs, I thought "If the Origin Server, CDN and Browser support gzip-encoding, why is an extra header needed?"
"If for some reason the client has an uncompressed version of the file in its cache, it will know not to subsequently request a compressed version of it again and instead to just use the uncompressed file from the cache." - Kyle Rush
"Like Kyle said... just replace "client" with a "upstream proxy" (isp, corporate network, etc). So you have the risk of serving uncompressed version to end user that supports gzip, and vice versa." - Sajal Kayan
Imagine two clients: an old browser without compression, and a modern one with it. If they both request the same page, then depending on who sent the request first, the compressed or uncompressed version would be stored in the CDN. Now the problems start: the old browser could ask for a regular "index.html" and get the cached, compressed version (random junk data), or the new browser could get the cached, uncompressed version and try to "unzip" it. Bad news, either way.
The fix is for the origin server to send back Vary: Accept-Encoding. Now the intermediate CDNs will keep separate cache entries (one for Accept-Encoding: gzip, another if you didn't send the header). These days you're unlikely to have clients without compression, but why risk cache mixups? Origin servers should include Vary: Accept-Encoding, and here's how:
Apache (requires mod_headers):
Header append Vary: Accept-Encoding
IIS (inside <customHeaders> in web.config):
<add name="Vary" value="Accept-Encoding"></add>
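The two server directives above cover origins where the web server sets the header. If the origin happens to be a Python application instead, something similar could be done in a small WSGI wrapper. This is only a sketch under that assumption: add_vary_accept_encoding is a made-up name, not part of any framework, and many frameworks or compression middlewares already add this header for you.

```python
# Hypothetical WSGI wrapper that appends Accept-Encoding to the Vary header.
# A sketch only; check whether your framework or server already does this.

def add_vary_accept_encoding(app):
    """Wrap a WSGI app so every response carries Vary: Accept-Encoding."""

    def middleware(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Collect any Vary values the app already set.
            existing = [v for k, v in headers if k.lower() == "vary"]
            others = [(k, v) for k, v in headers if k.lower() != "vary"]
            tokens = [t.strip() for v in existing
                      for t in v.split(",") if t.strip()]
            lowered = [t.lower() for t in tokens]
            # Leave "Vary: *" alone and avoid adding a duplicate token.
            if "*" not in tokens and "accept-encoding" not in lowered:
                tokens.append("Accept-Encoding")
            others.append(("Vary", ", ".join(tokens)))
            return start_response(status, others, exc_info)

        return app(environ, patched_start_response)

    return middleware
```

Like Header append in Apache, this merges with an existing Vary value rather than overwriting it, so a response that already varies on something else keeps doing so.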