What is a cache?

Caching means storing items in a temporary location, or cache, so that they can be accessed more easily later. The term cache in this sense is general. For instance, the members of the first successful expedition to the South Pole stacked, or cached, food supplies along the route on their way there, so they could consume them while returning. Such caching was far more practical than having the supplies delivered from their base camp, which might not even have been feasible.

In computing, caching serves a similar purpose: when data is needed, it has to be fetched from storage to the place of processing. Since such a transfer is generally slow, after transferring the data to the place of processing for the first time we often cache a copy of it next to the processing unit. If it turns out that we need it later, we save a storage round-trip, which generally yields a considerable reduction in latency.
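The idea can be illustrated with a minimal sketch of a read-through cache. The names slow_fetch and ReadThroughCache are hypothetical, and the short sleep merely stands in for a slow storage round-trip.

```python
import time


def slow_fetch(key: str) -> str:
    """Hypothetical stand-in for a slow round-trip to storage."""
    time.sleep(0.01)
    return f"value-of-{key}"


class ReadThroughCache:
    """Keeps a copy of every fetched value next to the caller."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        if key in self._store:
            # Cache hit: no storage round-trip is needed.
            self.hits += 1
            return self._store[key]
        # Cache miss: fetch from storage and keep a copy for later.
        self.misses += 1
        value = self._fetch(key)
        self._store[key] = value
        return value


cache = ReadThroughCache(slow_fetch)
first = cache.get("a")   # goes to storage
second = cache.get("a")  # served from the cache
```

Only the first call pays for the round-trip; the second is answered from the local copy.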

Examples in computing include caching memory blocks closer to the CPU so we save round-trips between the CPU and RAM; caching domain name resolutions in DNS resolvers so that we save (multiple) round-trips between the resolver and the root, top-level domain and authoritative name servers; and caching HTTP content in a CDN so that clients can load content quickly without having to fetch it from origin servers.

In this article we focus on HTTP and CDN caching in particular.

Private, shared and managed caching in HTTP

HTTP has two types of caches: private and shared. A private cache is a cache specific to a particular client, like a web browser. Private caches can store client-specific content that may contain sensitive information.

A shared cache, on the other hand, resides between the client and the origin server and stores content that is shared among multiple clients; such caches must not contain sensitive (personal) content. The most common type of shared cache is a managed cache. These are deliberately set up to reduce the load on the origin server and to improve content delivery. Examples include reverse proxies, CDNs, and service workers in combination with the Cache API.

Managed caches are called managed because they allow cached contents to be managed explicitly, through administration panels, service calls or a similar mechanism. This is in contrast to other types of caches, which the server controls only through the Cache-Control response header field. The following are a few example capabilities one may find in a managed cache:

  1. Cache purging. A managed cache can purge cached contents on demand.
  2. Cache directive override. A managed cache may offer custom caching policies overriding the ones put forth by the origin server. For instance, a managed cache can be set to always (or never) cache certain resources, regardless of how the origin server sets their Cache-Control.
  3. Cache expiry manipulation. Similar to the above, a managed cache might override the duration before the cached resources become stale.
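
The three capabilities above can be sketched in a few lines. This is a toy model under stated assumptions, not how any particular CDN works: the ManagedCache class, its method names and the prefix-based rules are all hypothetical, time is passed in explicitly to keep the example deterministic, and an expired entry is simply treated as a miss.

```python
from typing import Optional


class ManagedCache:
    """Toy shared cache with purge and TTL-override controls (hypothetical API)."""

    def __init__(self):
        self._store = {}          # url -> (body, expires_at)
        self._ttl_overrides = {}  # url prefix -> ttl in seconds (0 = never cache)

    def set_ttl_override(self, prefix: str, ttl: int) -> None:
        # Directive override / expiry manipulation: the operator's rule
        # wins over whatever lifetime the origin server asked for.
        self._ttl_overrides[prefix] = ttl

    def _effective_ttl(self, url: str, origin_ttl: int) -> int:
        for prefix, ttl in self._ttl_overrides.items():
            if url.startswith(prefix):
                return ttl
        return origin_ttl

    def store(self, url: str, body: str, origin_ttl: int, now: float) -> bool:
        ttl = self._effective_ttl(url, origin_ttl)
        if ttl <= 0:
            return False  # configured never to be cached
        self._store[url] = (body, now + ttl)
        return True

    def lookup(self, url: str, now: float) -> Optional[str]:
        entry = self._store.get(url)
        if entry is None:
            return None
        body, expires_at = entry
        return body if now < expires_at else None

    def purge(self, prefix: str) -> int:
        # Cache purging: drop matching entries on demand.
        doomed = [url for url in self._store if url.startswith(prefix)]
        for url in doomed:
            del self._store[url]
        return len(doomed)


managed = ManagedCache()
managed.set_ttl_override("/api/", 0)        # never cache API responses
managed.set_ttl_override("/static/", 3600)  # keep static files for an hour

stored_css = managed.store("/static/app.css", "body{}", origin_ttl=60, now=0)
stored_api = managed.store("/api/user", "{}", origin_ttl=60, now=0)
css_after_two_minutes = managed.lookup("/static/app.css", now=120)
purged = managed.purge("/static/")
css_after_purge = managed.lookup("/static/app.css", now=121)
```

In this sketch the stylesheet outlives the 60 seconds requested by the origin because of the override, and disappears as soon as it is purged.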

How does caching work?

The benefits of caching

Even though many websites today are dynamic, meaning they generate content on the fly using data from a database, they still contain a lot of static content. Examples include images, video and audio files, JavaScript and CSS resources, file archives and the like. If such resources are cached at locations closer to the end-users, such as on a CDN, the origin server saves bandwidth while the end-users receive content faster and more reliably.

The caching mechanism

When content is requested for the first time, the request is sent to the origin server, which returns the requested resources. For every resource that is returned, the origin server sets a caching policy which tells whether the resource can be cached, where, and for how long.
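
In HTTP, this policy travels in the Cache-Control response header. The following sketch reads the three answers (whether, where, how long) out of a header value. The function names are hypothetical, only the no-store, private and max-age directives are interpreted, and the simple comma split would mishandle quoted values that themselves contain commas.

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control value into a {directive: value-or-True} dict."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else True
    return directives


def describe_policy(header: str) -> dict:
    """Answer: may it be cached, where, and for how many seconds?"""
    d = parse_cache_control(header)
    if "no-store" in d:
        return {"cacheable": False, "where": None, "max_age": None}
    where = "private only" if "private" in d else "private and shared"
    raw = d.get("max-age")
    max_age = int(raw) if isinstance(raw, str) and raw.isdigit() else None
    return {"cacheable": True, "where": where, "max_age": max_age}


public_policy = describe_policy("public, max-age=3600")
private_policy = describe_policy("private, max-age=60")
uncacheable = describe_policy("no-store")
```

Here public_policy allows any cache to keep the resource for an hour, private_policy restricts it to the client's own cache for a minute, and uncacheable forbids storing it at all.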

If a request is made for a resource that is in the cache, a cached copy is used, either from a private cache in the browser or from a shared one in the CDN. In the first case a full HTTP request is saved; in the second, the HTTP request is sent to a CDN that is closer than the origin server and can respond faster.
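
The lookup order can be sketched as follows. Plain dictionaries stand in for the browser cache, the CDN and the origin server; the function name and the URLs are hypothetical, and the sketch assumes every resource may be cached everywhere and never expires.

```python
def fetch(url: str, private_cache: dict, shared_cache: dict, origin: dict):
    """Return (body, source), trying the nearest cache first."""
    if url in private_cache:
        # No HTTP request leaves the client at all.
        return private_cache[url], "private cache"
    if url in shared_cache:
        # A request is made, but only as far as the nearby shared cache.
        body = shared_cache[url]
        private_cache[url] = body
        return body, "shared cache"
    # Full trip to the origin; both caches keep a copy on the way back.
    body = origin[url]
    shared_cache[url] = body
    private_cache[url] = body
    return body, "origin"


origin = {"/logo.png": b"PNG"}
cdn = {}
browser_a = {}
browser_b = {}

_, source_1 = fetch("/logo.png", browser_a, cdn, origin)  # first request ever
_, source_2 = fetch("/logo.png", browser_a, cdn, origin)  # same client again
_, source_3 = fetch("/logo.png", browser_b, cdn, origin)  # a different client
```

The second client has never seen the resource, yet it is served from the shared cache because the first client's request already populated it.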

A resource in cache is in one of two states: either fresh or stale. A fresh resource is valid and can be used immediately while a stale resource has expired and needs to be validated.

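The driving factor in deciding whether a resource is stale is its age: the time that has passed since it was fetched, compared against the freshness lifetime granted by the caching policy. Whether a stale resource has actually changed at the origin is a separate question; in HTTP, this can be established either by examining the time it was last modified or, even simpler, by inspecting the resource’s version number.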
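
A minimal sketch of the age check, assuming the freshness lifetime comes from a max-age value in seconds and that age is measured from the moment the copy was stored. Real HTTP caches also account for headers such as Age and Date, which this sketch ignores; the function names are hypothetical.

```python
def current_age(stored_at: float, now: float) -> float:
    """Seconds elapsed since the copy was stored (never negative)."""
    return max(0.0, now - stored_at)


def is_fresh(stored_at: float, max_age: float, now: float) -> bool:
    """A copy is fresh while its age is below its freshness lifetime."""
    return current_age(stored_at, now) < max_age


fresh_at_30s = is_fresh(stored_at=1000, max_age=60, now=1030)
fresh_at_60s = is_fresh(stored_at=1000, max_age=60, now=1060)
```

With a lifetime of 60 seconds, the copy is still fresh half a minute after being stored and becomes stale once its age reaches the full 60 seconds.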

When a cached resource becomes stale, we do not discard it immediately. Instead, its freshness can be restored by asking the origin server whether the resource has changed. This is achieved with a conditional HTTP request that contains an extra header specifying either i) the date when the cached resource was last modified (the If-Modified-Since header), or ii) the version of the content that is cached (the If-None-Match header).

If it turns out that the stale resource did not change, the origin server responds with the status code 304 Not Modified; this is the origin server’s feedback stating that the cached version is up-to-date. Note that this is a very short message: the 304 Not Modified is a header-only response; it contains no body. If the resource has changed, however, the server returns a normal 200 OK response whose body contains the updated resource.

This process is called validation, or sometimes revalidation.
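
The exchange can be simulated without any networking. In the sketch below, the version is carried in an ETag header and sent back in If-None-Match, which are the standard HTTP headers for version-based validation. Everything else is an assumption made for illustration: the Origin and Response classes, the revalidate helper, and the choice to derive the version from a hash of the content.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Response:
    status: int
    headers: dict
    body: bytes


class Origin:
    """Toy origin server holding a single resource."""

    def __init__(self, content: bytes):
        self.content = content

    def _etag(self) -> str:
        # Assumption: the version is a short hash of the current content.
        return '"' + hashlib.sha256(self.content).hexdigest()[:16] + '"'

    def handle(self, request_headers: dict) -> Response:
        etag = self._etag()
        if request_headers.get("If-None-Match") == etag:
            # The cached version is current: headers only, empty body.
            return Response(304, {"ETag": etag}, b"")
        return Response(200, {"ETag": etag}, self.content)


def revalidate(cached: Response, origin: Origin) -> Response:
    """Send a conditional request and return the response to use."""
    reply = origin.handle({"If-None-Match": cached.headers["ETag"]})
    if reply.status == 304:
        return cached  # keep using the stored body
    return reply       # replace the stored copy with the new one


server = Origin(b"version 1")
cached = server.handle({})                 # initial 200 OK, stored in the cache
unchanged = revalidate(cached, server)     # origin answers 304
server.content = b"version 2"              # the resource changes at the origin
updated = revalidate(cached, server)       # origin answers 200 with a new body
```

When the origin answers 304, the cache keeps its stored body and only the headers cross the network; when the content has changed, the full new body is transferred.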

Glossary

HTTP

HTTP (Hypertext Transfer Protocol) is the protocol web browsers use to connect to web servers and request content to view. It is also used to transfer larger files, and is often used for software updates.

CDN

A CDN, or "Content Delivery Network," is a network of servers (typically placed around the world) used for delivering content (videos, photos, CSS, etc.).