Caching

What is caching?

A cache is a software component that stores data so that future requests for that data can be served faster (infographic).

[!IMPORTANT]

Caching a GraphQL API is different from caching endpoint-based APIs (e.g. REST).

HTTP Caching

HTTP caching is harder in GraphQL, because requests are usually sent as a POST to a single endpoint:

POST /graphql HTTP/1.1
content-type: application/json

{
  "query": "{ shop { name } }",
  "variables": null,
  "operationName": null
}

With the POST method we cannot use normal HTTP caching. However, the GraphQL spec does not require us to use only POST, so we can use the GET verb instead.
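As a rough sketch of what the GET variant could look like, the snippet below encodes the same operation into query-string parameters, so the whole request is identified by its URL and becomes something an HTTP cache can key on. The helper name `build_graphql_get_url` and the `https://example.com/graphql` endpoint are made up for illustration; the parameter names mirror the fields of the POST body.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def build_graphql_get_url(endpoint, query, variables=None, operation_name=None):
    """Encode a GraphQL operation as query-string parameters of a GET URL."""
    params = {"query": query}
    if variables is not None:
        # Variables travel as a JSON-encoded string; sorting keys keeps the
        # URL stable for logically identical variables.
        params["variables"] = json.dumps(variables, sort_keys=True, separators=(",", ":"))
    if operation_name is not None:
        params["operationName"] = operation_name
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint, same query as the POST example.
url = build_graphql_get_url("https://example.com/graphql", "{ shop { name } }")

# Decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)
print(url)
print(decoded["query"][0])
```

Because identical operations now produce identical URLs, the usual HTTP caching headers can apply to them. Very long queries can run into URL length limits, which is one reason this is only a sketch.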

How Useful Is HTTP Caching?

It is really great when:

We have immutable resources with a very long max-age.
- Things like JS/CSS assets.
- max-age is set to a year.
- Add a hash or version number to the file name, so whenever you deploy a new version you simply rename the file. Then your clients know that they need to fetch the new version.
We have mutable resources with a very clear expiration date.
Our clients are browsers:
- Set the proper headers on your backend.
- Browsers will take care of the rest¹ for you. No coding required.
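To make the immutable-asset case concrete, here is a minimal sketch that derives a fingerprinted file name from the file content and builds a one-year `Cache-Control` header. The function names and the 8-character digest length are arbitrary choices for this example, not something prescribed by the talk.

```python
import hashlib
from pathlib import PurePosixPath

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # 31536000


def fingerprinted_name(filename: str, content: bytes) -> str:
    """Insert a short content hash before the extension, e.g. app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    path = PurePosixPath(filename)
    return f"{path.stem}.{digest}{path.suffix}"


def immutable_asset_headers() -> dict:
    """Headers for an asset whose URL changes whenever its content changes."""
    return {"Cache-Control": f"public, max-age={ONE_YEAR_SECONDS}, immutable"}


# A new deploy with different content yields a different file name,
# so clients fetch it instead of reusing the old cached copy.
name_v1 = fingerprinted_name("app.js", b"console.log(1)")
name_v2 = fingerprinted_name("app.js", b"console.log(2)")
print(name_v1, name_v2, immutable_asset_headers())
```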

[!CAUTION]

It is not good if:

Our resources are mutable and the server needs to revalidate the cache, meaning the request has to go to the server anyway.

To do so we need to add ETag or Last-Modified headers to our responses, and clients send them back as If-None-Match or If-Modified-Since on later requests.

And most APIs are like this!

You have different clients that are not browsers.
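The revalidation round trip for mutable resources can be sketched as below. This is a simplified model, not a real server: `respond` is a hypothetical handler that returns a `(status, headers, body)` tuple, and the ETag is just a truncated content hash.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None):
    """Return (status, headers, body) for a conditional GET."""
    etag = make_etag(body)
    # "no-cache" lets clients store the response but forces revalidation.
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match == etag:
        # The client's copy is still valid: no body, but still a round trip.
        return 304, headers, b""
    return 200, headers, body


status, headers, _ = respond(b'{"shop": "A"}')
revalidated_status, _, revalidated_body = respond(b'{"shop": "A"}', headers["ETag"])
print(status, revalidated_status, len(revalidated_body))
```

Note that the server still has to produce (or at least look up) the current representation to compute the tag, which is why this kind of caching saves bandwidth more than server work.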

Utopia of Clients of a Backend App

Ideally, clients should not need to do anything to benefit from caching. This means that we need to take care of it at the application level :).

Application Level Caching – Caching Approaches

[!NOTE]

Make sure to read the Considerations section!

Caching a single resolver:
- This particular resolver is probably super slow, so we want to make it faster.
Caching frequently accessed data:
- Our app reads from the cache instead of querying the underlying DB.
- E.g. IdentityCache, which caches data in Memcached.
Caching all queries:
- We can do this with resolver-level caching via directives.
- In Apollo Server this is known as server-side caching on a per-field basis.
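The first approach, caching a single slow resolver, could look roughly like the decorator below. It is an in-process sketch with a time-to-live; `cached_resolver` and `resolve_shop_name` are invented names, and a real setup would more likely use a shared store such as Memcached rather than a local dict.

```python
import time
from functools import wraps


def cached_resolver(ttl_seconds, clock=time.monotonic):
    """Cache a resolver's results per argument tuple for ttl_seconds."""
    def decorator(resolver):
        store = {}  # args tuple -> (expires_at, value); args must be hashable

        @wraps(resolver)
        def wrapper(*args):
            now = clock()
            entry = store.get(args)
            if entry is not None and entry[0] > now:
                return entry[1]  # fresh entry: skip the slow resolver
            value = resolver(*args)
            store[args] = (now + ttl_seconds, value)
            return value

        return wrapper
    return decorator


calls = []  # records every time the underlying resolver really runs


@cached_resolver(ttl_seconds=60)
def resolve_shop_name(shop_id):
    calls.append(shop_id)  # stands in for a slow database query
    return f"shop-{shop_id}"


resolve_shop_name(1)
resolve_shop_name(1)  # served from the cache, the resolver body does not run
print(calls)
```

The `clock` parameter exists only so expiry can be exercised without waiting for real time to pass.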

[!NOTE]

It is probably obvious that we do not want to cache Mutations and Introspection*.

*If it is enabled and accessible.

Caching Queries

We declare whether an Object is cacheable or not.
The cache key determines whether a request is a cache hit or a cache miss.
Cache key structure: AppName:Query:Variables:OperationName

[!IMPORTANT]

Query and Variables need to be normalized first and then hashed. The hashed values are what we put into the key string.
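A minimal sketch of building such a key follows. The normalization here is deliberately naive (it only collapses whitespace, which would also alter whitespace inside string literals); a production version would more likely parse and re-print the query. The choice of SHA-256 and the `shop-api` app name are assumptions for the example.

```python
import hashlib
import json
import re


def normalize_query(query: str) -> str:
    """Naive normalization: collapse whitespace runs and trim the ends."""
    return re.sub(r"\s+", " ", query).strip()


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def cache_key(app_name, query, variables=None, operation_name=None):
    """Build AppName:Query:Variables:OperationName with hashed middle parts."""
    query_hash = sha256_hex(normalize_query(query))
    # Sorting keys makes {"a": 1, "b": 2} and {"b": 2, "a": 1} hash the same;
    # missing variables are treated like an empty object.
    variables_json = json.dumps(variables or {}, sort_keys=True, separators=(",", ":"))
    variables_hash = sha256_hex(variables_json)
    return f"{app_name}:{query_hash}:{variables_hash}:{operation_name or ''}"


key_a = cache_key("shop-api", "{ shop { name } }")
key_b = cache_key("shop-api", "{\n  shop {\n    name\n  }\n}")
print(key_a == key_b)
```

Without the normalization step, two queries that differ only in formatting would produce different keys and never share a cache entry.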

Deciding When to Use Caching

Measure how many requests you have.
How many of them are cacheable.
How many are not.
How frequently your cache gets invalidated.

If you have a ton of cacheable requests and you implement caching right, it will have a big impact.
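One way to turn those measurements into numbers is sketched below. The `CacheStats` class and the sample counts are invented for illustration; the point is only that the share of requests served from the cache is bounded by the share that is cacheable in the first place.

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    total_requests: int
    cacheable_requests: int
    cache_hits: int  # cacheable requests that were answered from the cache

    @property
    def cacheable_share(self) -> float:
        """Fraction of all requests that could be cached at all."""
        return self.cacheable_requests / self.total_requests

    @property
    def hit_ratio(self) -> float:
        """Fraction of cacheable requests that actually hit the cache."""
        return self.cache_hits / self.cacheable_requests

    @property
    def served_from_cache(self) -> float:
        """Fraction of all requests that never reached the resolvers."""
        return self.cache_hits / self.total_requests


# Made-up numbers: frequent invalidation shows up as a lower hit ratio.
stats = CacheStats(total_requests=1000, cacheable_requests=800, cache_hits=600)
print(stats.cacheable_share, stats.hit_ratio, stats.served_from_cache)
```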

Considerations

Query complexity (and query depth) is important since:

Our cache storage might run out of space if we cache a lot of queries.
Or, on the other hand, we might be kicking cached data out of our storage too soon if the cardinality² of our queries is too high.

Note: This won’t be an issue if you’re serving internal clients, since the variety of queries won’t be out of control.
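The eviction problem can be illustrated with a small bounded cache. The sketch below assumes a least-recently-used eviction policy (the source does not say which policy the store uses) and replays two made-up request streams against a cache that holds three entries.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded cache that evicts the least recently used entry when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        return None

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


def replay(cache, keys):
    """Serve each key from the cache if possible and return the hit ratio."""
    for key in keys:
        if cache.get(key) is None:
            cache.put(key, f"result-for-{key}")
    return cache.hits / len(keys)


# Three distinct queries fit in the cache; four distinct queries do not.
low_cardinality = replay(LRUCache(3), ["q1", "q2", "q3"] * 4)
high_cardinality = replay(LRUCache(3), ["q1", "q2", "q3", "q4"] * 3)
print(low_cardinality, high_cardinality)
```

With only one more distinct query than the cache can hold, each entry in this access pattern is evicted just before it is needed again, so the second stream never gets a hit.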

[!NOTE]

Learn more about query complexity in NestJS here and query depth in NodeJS here.

Ref

Scott Walkinshaw – Caching GraphQL APIs.

Footnotes

1. Read, write, and invalidation.
2. Variety.