How GIPHY uses Fastly to Achieve Global Scale
April 20, 2021 by Ethan Lu

GIPHY serves a lot of GIF media content. Over 10 billion pieces of content per day, in fact. In addition to media requests, which represent the actual download of GIFs, we also provide public API and SDK services for developers to use in their products, which gives their users access to our enormous library.

As with many tech companies with large daily traffic volume, we face scalability challenges. Our systems have to handle a high volume of requests (in the 100K requests per second range) while responding with low latency. There’s nothing worse than waiting for something to load — especially a GIF!
This is where an edge cloud platform plays a role: instead of making our AWS servers handle every request that comes our way, an edge cloud platform caches as much of the media content and search result JSON payload as possible. This works well because neither media content nor API responses change often. The edge cloud platform servers also distribute the request load among various regions. We use Fastly as our edge cloud platform provider to help us serve billions of pieces of content to our users.
Fastly Solution
Fastly provides a variety of features that allow us to deliver content at scale. These features can be broadly categorized as:
– Cache Layering
– Cache Management
– Edge Computing
Cache Layering
A basic edge cloud platform setup has the content cached at the edge. These server nodes are distributed globally and deliver the cached content to users making requests in their region. In the event the edge node does not have the content, it will make a request to our origin server to retrieve it.

This single layer setup has a drawback: each edge node maintains its own cache based on the requests from its region. A new piece of content may not be cached on any of the edge nodes, which can lead to surges in traffic to our origin servers as each edge node repeats a request for the same content. Viral content often exhibits this behavior as its popularity gains traction.
Fastly offers a second layer of cache service called the Origin Shield. Edge nodes that do not have the requested content in cache can now retrieve it from the Origin Shield layer; a request only needs to reach our origin server if the Origin Shield does not have the content either.

Cache Management
Now that the content is cached on the edge and Origin Shield, we need ways to manage their caching policies. Not all content should stay cached for the same duration, or TTL (Time to Live). For example, the information on an individual GIF will not change that much, so its API response can be cached over a relatively long period of time. On the other hand, the API response for the Trending Endpoint, which returns a continuously updated list of currently trending GIFs, would need to be on a short TTL due to the nature of trends.
Fastly is powered by Varnish, so all of the configurations are executed as Varnish Configuration Language (VCL) code. Both the edge and Origin Shield run VCL code, so we are able to set up various cache TTLs based on API endpoint paths with some simple VCL code:
# in vcl_fetch
if (req.url ~ "^/v1/gifs/trending") {
  # set 10 minute ttl for trending responses
  set beresp.ttl = 600s;
  return(deliver);
}
The cache TTL does not always have to be set by VCL code. When a cacheable item is missing or stale, the API request reaches origin, and origin can encode cache control instructions in its response; we just need to set up the VCL code so that these instructions take precedence. From origin, we propagate this decision to Fastly’s Origin Shield and edge nodes by setting cache control headers in the API response. Specifically, we use the Surrogate-Control header, since it is only read by Fastly nodes and is stripped before the response reaches the caller. So, we can update the above VCL to prioritize Surrogate-Control over the endpoint cache policies like this:
# in vcl_fetch
if (beresp.http.Surrogate-Control ~ "max-age" || beresp.http.Cache-Control ~ "(s-maxage|max-age)") {
  # upstream set some cache control headers, so Fastly will use its cache TTL
  return(deliver);
} else {
  # no cache headers, so use cache policies for endpoints
  if (req.url ~ "^/v1/gifs/trending") {
    # set 10 minute ttl for trending responses
    set beresp.ttl = 600s;
    return(deliver);
  }
}
With this setup, cached content invalidates itself with dynamic TTL policies that meet our needs — but we also need to invalidate cache explicitly when we don’t want to wait for it to expire naturally. The simplest way is to invalidate by the cache key (URL). This works well for media, but it is a bit more complicated for API responses.
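For media, purging a single URL can be as simple as issuing an HTTP PURGE request against it (a sketch of Fastly’s URL purge; the path here is a made-up example, and the purge endpoint can be configured to require an API token):

PURGE /media/gif_id_abc/giphy.gif HTTP/1.1
Host: media.giphy.com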
For example, our API search endpoint can return the same GIF for different queries, but it isn’t feasible for us to know every possible URL that yielded that GIF if we wanted to invalidate it:
# same GIF can appear in the response of all of these API calls
https://api.giphy.com/v1/gifs/search?api_key=__KEY1__&q=haha
https://api.giphy.com/v1/gifs/search?api_key=__KEY1__&q=hehe
https://api.giphy.com/v1/gifs/search?api_key=__KEY2__&q=lol
https://api.giphy.com/v1/gifs/search?api_key=__KEY3__&q=laugh
For this situation, we take advantage of Fastly’s Surrogate Keys! As the name suggests, a surrogate key can uniquely identify cached content, in much the same way the cache key does. Unlike the cache key, there can be multiple surrogate keys per stored result, and we get to choose what they are. Using the GIF IDs that appear in each API response gives us a way to identify multiple pieces of cached content that contain a given GIF:
# same GIF (gif_id_abc) can appear in the response of all of these API calls
https://api.giphy.com/v1/gifs/search?api_key=__KEY1__&q=haha
  Assign Surrogate Key: gif_id_abc
https://api.giphy.com/v1/gifs/search?api_key=__KEY1__&q=hehe
  Assign Surrogate Key: gif_id_abc
https://api.giphy.com/v1/gifs/search?api_key=__KEY2__&q=lol
  Assign Surrogate Key: gif_id_abc
https://api.giphy.com/v1/gifs/search?api_key=__KEY3__&q=laugh
  Assign Surrogate Key: gif_id_abc
We can even attach multiple surrogate keys to the same content:
https://api.giphy.com/v1/gifs/search?api_key=__KEY1__&q=haha
  Assign Surrogate Keys: gif_id_abc gif_id_def key_KEY1 q_haha
https://api.giphy.com/v1/gifs/search?api_key=__KEY1__&q=hehe
  Assign Surrogate Keys: gif_id_abc gif_id_123 key_KEY1 q_hehe
https://api.giphy.com/v1/gifs/search?api_key=__KEY2__&q=lol
  Assign Surrogate Keys: gif_id_abc gif_id_321 gif_id_456 key_KEY2 q_lol
https://api.giphy.com/v1/gifs/search?api_key=__KEY3__&q=laugh
  Assign Surrogate Keys: gif_id_abc key_KEY3 q_laugh
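Under the hood, the keys can be attached by having origin return them in Fastly’s space-delimited Surrogate-Key response header. A minimal sketch of what the first search response above might look like (the header name is Fastly’s; the key values are our hypothetical examples):

HTTP/1.1 200 OK
Content-Type: application/json
Surrogate-Key: gif_id_abc gif_id_def key_KEY1 q_haha

Fastly strips this header before the response is delivered to the caller, so the keys stay internal.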
Surrogate keys are a powerful feature that lets us select exactly which cached content to invalidate, with great precision and simplicity. With this setup, we are able to invalidate cache for situations such as the following (see the purge sketch after this list):
– Invalidate all cached API responses that contain a specific GIF
– Invalidate all cached API responses that are for a specific API key
– Invalidate all cached API responses that queried for certain words
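Each bullet above maps to one of the surrogate keys we attached (gif_id_*, key_*, q_*), and each purge becomes a single call to Fastly’s purge-by-surrogate-key API. A minimal sketch purging everything tagged gif_id_abc (SERVICE_ID and the token are placeholders, not real values):

POST /service/SERVICE_ID/purge/gif_id_abc HTTP/1.1
Host: api.fastly.com
Fastly-Key: __FASTLY_API_TOKEN__

Fastly also supports soft purging, which marks matching content as stale instead of evicting it outright.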

Running Code At The Edge
VCL provides a lot of versatility in what we can do in the edge cloud platform configuration. We showed before how the configuration can set various cache TTL policies for the edge and Origin Shield nodes, but we can also use VCL to modify request information.
We can have code rewrite the incoming request URL. This comes in handy when we need to make changes to our API endpoints without requiring our consumers to update their calls.
# in vcl_recv
if (req.url ~ "^/some-old-endpoint") {
  # rewrite to the new endpoint
  set req.url = regsub(req.url, "/some-old-endpoint", "/new-and-improved-endpoint");
}
We can even select a percentage of incoming requests to test experimental features. Using Fastly’s randomness library, we can add a special header to some of our requests that enables new behavior in our origin server.
# in vcl_recv
set req.http.new_feature = "0";
if (randombool(1, 10000)) {
  # 0.01% of the traffic gets to see the new feature
  set req.http.new_feature = "1";
}
This, combined with Fastly’s edge dictionaries, allows us to set up different behaviors with minimal code.
# API keys that will have a percentage of their requests use the new feature
table new_feature_access {
  "__API_KEY1__": "1",
  "__API_KEY2__": "5",
  "__API_KEY3__": "1000",
}

sub vcl_recv {
  set req.http.new_feature = "0";
  # check if the request has an api key that is set up to have a
  # percentage of its requests use the new feature
  if (randombool(std.atoi(table.lookup(new_feature_access, subfield(req.url.qs, "api_key", "&"), "0")), 10000)) {
    set req.http.new_feature = "1";
  }
  return(lookup);
}

The dictionary values are numerators out of 10,000, so __API_KEY2__ sees the new feature on 0.05% of its requests, while keys not in the table fall back to a numerator of 0 and never see it.
This is just scratching the surface of what VCL enables. Check out Fastly’s documentation if you want to see what else is possible!

Tips and Tricks
We use a lot of Fastly features to power the world with animated GIF content. However, configuring the edge cloud platform can be quite complex when there is so much functionality at your disposal, so here are some tips and tricks we recommend to help you along the way.
VCL Execution In Edge and Origin Shield
With a two layer cache setup, one key thing to remember is that the same VCL code will execute on both the edge and Origin Shield. This can cause unexpected outcomes if the VCL code changes request/response state information.
For example, our VCL code from before sets the cache TTL on both the Origin Shield and the edge nodes, based either on cache control headers from upstream or on the policies specified in the VCL itself:
# in vcl_fetch
if (beresp.http.Surrogate-Control ~ "max-age" || beresp.http.Cache-Control ~ "(s-maxage|max-age)") {
  # upstream set some cache control headers, so Fastly will use its cache TTL
  return(deliver);
} else {
  # no cache headers, so use cache policies for endpoints
  if (req.url ~ "^/v1/gifs/trending") {
    # set 10 minute ttl for trending responses
    set beresp.ttl = 600s;
    return(deliver);
  }
}
Suppose that for the trending endpoint we also want to set the response’s Cache-Control header, instructing callers how long to cache the content on their side. This could simply be done as:
# in vcl_fetch
if (beresp.http.Surrogate-Control ~ "max-age" || beresp.http.Cache-Control ~ "(s-maxage|max-age)") {
  # upstream set some cache control headers, so Fastly will use its cache TTL
  return(deliver);
} else {
  # no cache headers, so use cache policies for endpoints
  if (req.url ~ "^/v1/gifs/trending") {
    # set 10 minute ttl for trending responses
    set beresp.ttl = 600s;
    # set 30 second ttl for callers
    set beresp.http.Cache-Control = "max-age=30";
    return(deliver);
  }
}
The Origin Shield would execute this VCL code, add the Cache-Control header to the response, and return it to the edge. The edge, however, would see that Cache-Control is set on the response, take the first branch, and end up with a cache TTL of 30 seconds instead of the intended 10 minutes!
Fortunately, Fastly provides a way of distinguishing between the edge and Origin Shield: it sets a header (Fastly-FF) on requests forwarded between its own nodes, so a request without that header is being handled at the edge:
# in vcl_fetch
if (req.url ~ "^/v1/gifs/trending") {
  # set 10 minute ttl for trending responses
  set beresp.ttl = 600s;
  return(deliver);
}

# in vcl_deliver
if (!req.http.Fastly-FF) {
  # set 30 second ttl for callers
  set resp.http.Cache-Control = "max-age=30";
}
With this addition, the Cache-Control header is only set at the edge node, and our cache policies behave as expected again!
Debugging and Testing
The pitfall we just mentioned can be quite difficult to detect and debug: the VCL code just runs on a server and presents you with the response and its headers. We can add debugging information into custom headers and view them in the response, but this gets unwieldy quickly.
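For example, a couple of lines in vcl_deliver can surface cache state in the response (a minimal sketch; the X-Debug-* header names are our own invention, while fastly_info.state and obj.hits are Fastly-provided variables):

# in vcl_deliver
# expose whether this response was a HIT, MISS, PASS, etc.
set resp.http.X-Debug-Cache-State = fastly_info.state;
# expose how many times this object has been served from cache
set resp.http.X-Debug-Hits = obj.hits;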
Fortunately, there is the Fastly Fiddle tool, which provides better visibility into what the VCL code does when it executes. We can simulate the various VCL code parts in this tool and get more information on how Fastly’s edge and Origin Shield servers will behave with the VCL code.
We put together a fiddle of the above example that shows how double execution of the VCL can affect the cache TTL.
We set up the VCL in the appropriate sections on the left, and execute it to see how Fastly would have handled the request on the right:

The fiddle output shows a lot of useful information about the life cycle of the request as it goes through the edge and Origin Shield nodes. In a real-world setting, the VCL code can be very complex, and this is where the tool really shines.
— Ethan Lu, Tech Lead, API Team