Scaling GIF Data Storage to Infinity and Beyond!
January 23, 2018
Every day here at GIPHY, we have GIFs on GIFs on GIFs uploaded to our platform. In fact, we serve over three billion GIFs a day (that’s a lot of dancing cats!) to over 300 million daily active users. Every upload, however, brings new information and an ever-growing amount of data, which can become a lot to manage. Using database services like MySQL and DynamoDB, we’re able to organize this large volume of data while keeping performance high.
Scaling our database infrastructure
To understand how much data we’re talking about, let’s first walk through exactly what information is behind every GIF. For starters, size matters. Each uploaded GIF generates a good amount of data that we need to manage. GIFs are transcoded into different sizes (called renditions), which are optimized for different screen types such as tablets, desktops, and phones. In addition, our machine learning models analyze and annotate each GIF with a lot of data, such as which celebrity might be in the GIF, what its MPAA content rating is, and how it relates to other GIFs in our catalog. Along with all this metadata come associated performance costs that could hinder our ability to scale and grow.
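To make the shape of this data concrete, here is a minimal sketch of how one GIF's annotations and renditions might be flattened into key-value items. The field names (`gif_id`, `sk`, `rating`, `tags`, and so on) and the key layout are illustrative assumptions, not GIPHY's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Rendition:
    """One transcoded size of a GIF (hypothetical fields)."""
    name: str        # illustrative rendition name, e.g. "preview"
    width: int
    height: int
    size_bytes: int


@dataclass
class GifRecord:
    """A GIF plus its annotations and renditions (assumed layout)."""
    gif_id: str
    rating: str                                   # content rating annotation
    tags: list = field(default_factory=list)      # ML-generated annotations
    renditions: list = field(default_factory=list)

    def to_items(self):
        """Return one metadata item followed by one item per rendition."""
        items = [{
            "gif_id": self.gif_id,    # partition key (assumption)
            "sk": "meta",             # sort key (assumption)
            "rating": self.rating,
            "tags": list(self.tags),
        }]
        for r in self.renditions:
            items.append({
                "gif_id": self.gif_id,
                "sk": f"rendition#{r.name}",
                "width": r.width,
                "height": r.height,
                "size_bytes": r.size_bytes,
            })
        return items
```

With a layout like this, every rendition adds another item per GIF, which is why the metadata grows much faster than the number of uploads.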
To address this challenge, we turned to an AWS datastore called DynamoDB and found it to be a perfect companion to our existing data store. With DynamoDB, we had the flexibility to continuously grow the size of our data store without incurring these performance costs. Combined with DynamoDB Accelerator (DAX), an in-memory cache for DynamoDB, we achieved better performance than with a traditional relational database.
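The benefit of putting a cache such as DAX in front of a table comes from the read-through pattern: repeated reads of the same key are answered from memory instead of the backing store. The sketch below illustrates that pattern in plain Python. It is not the DAX client API, and the class name, the `loader` callback, and the TTL value are all assumptions made for illustration.

```python
import time


class ReadThroughCache:
    """Tiny in-memory read-through cache with a per-entry TTL.

    Illustrates the caching pattern only; this is not the DAX client.
    """

    def __init__(self, loader, ttl_seconds=300.0, clock=time.monotonic):
        self._loader = loader      # function: key -> value (the slower store)
        self._ttl = ttl_seconds    # arbitrary TTL chosen for the sketch
        self._clock = clock        # injectable clock, handy for testing
        self._entries = {}         # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: read from the backing store and remember it.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

For example, wrapping a dictionary lookup with `ReadThroughCache(table.get)` and calling `get` twice for the same key reaches the backing store only once, as long as the entry has not expired.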
Success with DynamoDB
Over a period of several weeks, we migrated all of the metadata accumulated for the different GIF sizes from MySQL to DynamoDB. At each step, we measured the performance impact on the system, and overall we saw read performance improve by around 25%. Moving this high-frequency data from MySQL to DynamoDB also improved overall system performance. The diagram below illustrates the role of DynamoDB in our stack.
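A migration of this kind typically reads rows out of the relational table and writes them to DynamoDB in batches. The sketch below covers only the row-to-item conversion and the batching, in plain Python. The column names are hypothetical, the batch size of 25 reflects the per-call limit of DynamoDB's `BatchWriteItem` operation, and none of this is GIPHY's actual migration code.

```python
def to_put_requests(rows):
    """Convert relational rows (dicts) into DynamoDB-style put requests."""
    requests = []
    for row in rows:
        item = {}
        for column, value in row.items():
            if value is None:
                continue  # skip NULL columns instead of storing them
            if isinstance(value, bool):
                # Checked before int, because bool is a subclass of int.
                item[column] = {"BOOL": value}
            elif isinstance(value, (int, float)):
                item[column] = {"N": str(value)}  # numbers are sent as strings
            else:
                item[column] = {"S": str(value)}
        requests.append({"PutRequest": {"Item": item}})
    return requests


def batches(requests, size=25):
    """Yield successive chunks of at most `size` requests."""
    for start in range(0, len(requests), size):
        yield requests[start:start + size]
```

Each batch could then be sent through an AWS SDK's batch-write call, retrying anything the service reports back as unprocessed. Migrating in small batches like this also makes it straightforward to measure the performance impact after each step.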
For any company experiencing rapid user growth, scaling data can be a hard and daunting task. Here at GIPHY, we were pleased to learn that AWS offers a variety of services that help our systems adapt to continued user growth. In our case, RDS combined with DynamoDB and DAX was the perfect combination to alleviate our growing pains.
— Alex Hoang, Services Engineer
— Nima Khoshini, Services Team Lead