Contextually-Aware Search: GIPHY Gets Work-Specific

September 4, 2018 by Zachary Hay

GIPHY has popular integrations on many platforms like Facebook Messenger, Twitter, and Slack. We serve tens of billions of requests a month to all integrations via our API. Users search our industry-leading database of GIFs and stickers to find the perfect one to send in all of these messaging platforms.

GIPHY’s search engine calculates click-through rates (CTR) from giphy.com and our mobile apps, and it is optimized for those contexts. I found that users’ content preferences on Slack differ significantly from those of mobile and website users, meaning that the best GIFs for Slack were not always at the top of search results. To solve this problem, I introduced a contextually-aware search algorithm for our Slack integration. This resulted in a significant reordering of many search results and a relative increase of 7.4% in the overall send rate of users in the Slack integration.

 

THE GIPHY SLACK INTEGRATION

GIPHY’s Slack integration lets users send GIFs and stickers to coworkers using the translate endpoint in our API. The integration has two modes: shuffle and random. Sending the “/giphy query” command in shuffle mode starts a search session where users shuffle through a list of GIFs until they find the perfect one to send. My project revolved around calculating the send rates for specific GIFs using click-actions from shuffle mode (more on that later). The random mode is a bit more exciting. Users enter the Slack command and a GIF is immediately sent from a pool of relevant GIFs.

On GIPHY.com or the mobile app, users can scroll through pages of GIF search results to make a selection. However, our Slack integration presents a unique search context since users are only presented with a single GIF at a time. This poses a challenge because we have to inject a bit of randomness in the mix to make things novel and fun while keeping the results relevant.

Two ways of celebrating in Slack at work: Shuffle Mode (left) and Random Mode (right)

 

 

Celebrating on Giphy.com

Many GIFs – One Page

 

CONTEXTUALLY-AWARE SLACK SEARCH

GIPHY engineers previously implemented a creative solution around the constraints of the Slack integration called the translate endpoint. The previous release of the translate endpoint selected a GIF from the top 25 search results for a query, with a larger probability of selecting a GIF with a higher ranking. This way, users received a variety of relevant GIFs, instead of the same one or two ad nauseam. The goal of contextually-aware search was to release a new version of the translate endpoint featuring Slack-specific search weights. To do this, we used click-action data collected from our Slack integration to calculate the send rate of a GIF for a given query.
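To make the rank-weighted selection concrete, here is a minimal sketch of picking one GIF from the top 25 results so that higher-ranked GIFs are more likely to be chosen. The function name `pick_gif` and the geometric `decay` weighting are assumptions for illustration; the actual weights used by the translate endpoint are not described in this post.

```python
import random


def pick_gif(ranked_gif_ids, pool_size=25, decay=0.9, rng=random):
    """Pick one GIF ID from the top of a ranked result list.

    Higher-ranked GIFs receive larger weights. The geometric decay is a
    hypothetical choice, not the production weighting.
    """
    pool = ranked_gif_ids[:pool_size]
    if not pool:
        raise ValueError("no search results to choose from")
    # Rank 0 gets weight 1.0, rank 1 gets decay, rank 2 gets decay**2, ...
    weights = [decay ** rank for rank in range(len(pool))]
    return rng.choices(pool, weights=weights, k=1)[0]
```

A smaller `decay` concentrates picks on the top few results, while a value close to 1.0 spreads them almost evenly across the pool.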

I wrote a Spark job to calculate the send rate of a GIF with its query over an arbitrary period of time. Calculating the send rate over several months yields enough example data (query / GIF / user action) across an enormous number of searches to make an informed estimate of the send rate.
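The aggregation itself is simple to sketch. The version below uses plain Python rather than Spark, and it assumes each event is a `(query, gif_id, action)` tuple recording the action a user took on a GIF that was shown (for example `"send"`, `"shuffle"`, or `"cancel"`). The event shape, the action names, and the `min_impressions` threshold are hypothetical, not the schema of the real job.

```python
from collections import Counter


def send_rates(events, min_impressions=1):
    """Return {(query, gif_id): send rate} from shuffle-mode events.

    Every event counts as one impression of the GIF for that query;
    events whose action is "send" also count as one send.
    """
    seen = Counter()
    sent = Counter()
    for query, gif_id, action in events:
        key = (query, gif_id)
        seen[key] += 1
        if action == "send":
            sent[key] += 1
    # Skip pairs with too few impressions to give a meaningful estimate.
    return {
        key: sent[key] / count
        for key, count in seen.items()
        if count >= min_impressions
    }
```

Raising `min_impressions` trades coverage for stability: rarely seen query/GIF pairs drop out instead of receiving a noisy rate.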

What is the difference between these two GIFs?

                                                

The angry doctor GIF on the left is the highest-ranked GIF in our database for the query “monday” (according to multiple factors, including CTR), while the escalator GIF on the right is ranked 16th. Their Slack send rates, however, vary greatly: the escalator GIF is sent 46% of the times it’s seen, while the angry doctor GIF is sent only 25% of the time. Prior to my project, the Slack integration was five times more likely to return the angry doctor GIF than the escalator GIF. This large discrepancy affected hundreds of other popular queries like “mad,” “sweet,” and “thanks.” Clearly, Slack users have different content preferences than users of our other products, and we had a great case for creating Slack-specific search scores in the integration.

Our next step was to update the translate endpoint to use the send rate data we calculated for each GIF/query pair. We decided to build on top of the base logic of the previous version of the translate endpoint, but expanded the pool of GIFs from 25 to 50. Now, GIPHY’s search engine returns the top 50 GIFs for a search, then the translate endpoint orders them according to their Slack-specific send rate.
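The reordering step might look roughly like the sketch below. It assumes the send rates are available as a dictionary keyed by `(query, gif_id)` and that GIFs with no Slack data are treated as having a send rate of 0.0; both are assumptions for illustration, and `rerank_for_slack` is a hypothetical name.

```python
def rerank_for_slack(query, ranked_gif_ids, rates, pool_size=50):
    """Reorder the top search results by Slack-specific send rate.

    `rates` maps (query, gif_id) to a send rate between 0 and 1.
    """
    pool = ranked_gif_ids[:pool_size]
    # sorted() is stable, so GIFs with equal (or missing) send rates
    # keep their original search-engine order relative to each other.
    return sorted(
        pool,
        key=lambda gif_id: rates.get((query, gif_id), 0.0),
        reverse=True,
    )
```

With the “monday” numbers above, the escalator GIF (46%) would move ahead of the angry doctor GIF (25%) even though the general search engine ranks it 16th.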

This post would not be complete without listing the other new features of the translate endpoint, available to developers here, in addition to the Slack-specific search score detailed above:

– No duplicates in shuffle mode. We now keep shuffle session state so the same GIF will not be returned within 10 shuffles.
– Weirdness as you shuffle. The start of a shuffle session returns highly relevant GIFs; they get weirder (or more random) as you shuffle.
– A weirdness param can be passed to the translate endpoint. It determines how weird (or random) the GIFs we return will be.
– New GIFs or stickers added to GIPHY’s database that Slack users enjoy will quickly “bubble up” to the top of the search results due to continuously recalculating send rate data.
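As a rough illustration of how the first three features could fit together, the sketch below keeps per-session state, excludes the last 10 GIFs returned, and widens the window of candidates as the user shuffles or as `weirdness` increases. The class name, the window formula, and the fallback when the pool is too small are all assumptions, not the service’s actual logic.

```python
import random
from collections import deque


class ShuffleSession:
    """Hypothetical shuffle session over a ranked list of GIF IDs."""

    def __init__(self, ranked_gif_ids, weirdness=0, memory=10, rng=None):
        self.pool = list(ranked_gif_ids)
        self.weirdness = weirdness
        # Remembers the most recent `memory` GIFs so they are not repeated.
        self.recent = deque(maxlen=memory)
        self.shuffles = 0
        self.rng = rng or random.Random()

    def next_gif(self):
        # Exclude recently shown GIFs; fall back to the full pool if the
        # pool is too small for that to leave any candidates.
        candidates = [g for g in self.pool if g not in self.recent] or self.pool
        # The window starts at the very top of the ranking and grows with
        # each shuffle and with weirdness, so later picks are more random.
        window = min(len(candidates), 1 + self.weirdness + self.shuffles)
        choice = self.rng.choice(candidates[:window])
        self.recent.append(choice)
        self.shuffles += 1
        return choice
```

With `weirdness=0`, the first call always returns the top-ranked GIF, and each later shuffle draws from a slightly larger slice of the remaining results.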

We implemented all of these features in a compact microservice with a queryable API that could handle Slack’s daily throughput. A Redis datastore was used to hold all send rate information. Below is a diagram laying out the architecture and flow of a request:

          

 

MEASURING IMPACT

There were three Key Performance Indicators (KPIs) established for measuring success of the revamped Slack integration: shuffle mode session length; overall send rate; and the “one GIF, one submit” rate of shuffle mode sessions. The “one GIF, one submit” rate is important because it is a proxy for the quality of random mode GIFs. We can’t directly measure random mode sessions because the first GIF is always returned. There is no notion of shuffling, canceling, or sending. A high “one GIF, one submit” rate in shuffle mode, though, indicates that users are getting a good GIF in random mode because shuffle mode and random mode use the same underlying process.
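The three KPIs can be sketched from shuffle-mode session logs. The example assumes each session is recorded as the ordered list of actions the user took, one per GIF shown, ending in `"send"` or `"cancel"`. That log shape is an assumption, as is defining the send rate here as the fraction of sessions that end in a send.

```python
def shuffle_kpis(sessions):
    """Compute shuffle-mode KPIs from a list of per-session action lists."""
    sessions = [s for s in sessions if s]  # ignore empty sessions
    if not sessions:
        raise ValueError("no sessions to measure")
    total = len(sessions)
    return {
        # Average number of GIFs shown per session.
        "mean_session_length": sum(len(s) for s in sessions) / total,
        # Fraction of sessions that ended with the user sending a GIF.
        "send_rate": sum(s[-1] == "send" for s in sessions) / total,
        # Fraction of sessions where the first GIF shown was sent.
        "one_gif_one_submit_rate": sum(
            len(s) == 1 and s[0] == "send" for s in sessions
        ) / total,
    }
```

A session such as `["send"]` counts toward the “one GIF, one submit” rate, while `["shuffle", "send"]` counts only toward the send rate.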

After running Slack-specific search for a week, we saw dramatic improvements in all three metrics. These improvements are calculated across all queries, not just a high-performing subset. Furthermore, a higher proportion of sessions ended in 0-1 shuffles, which means shuffle mode users aren’t having to work as hard to get the GIFs they want. Contextually-aware search had a huge impact on our KPIs!

 

HOURLY ONE GIF, ONE SEND RATE (AUG 13-AUG 21)

 

MY SUMMER AT GIPHY

I worked on this project as a Search Engineering Intern on the R&D team. The R&D team works on a variety of long-range projects to improve search, from content moderation to GIF recommendations, using machine learning and adaptive systems. We are tasked with introducing new, cutting-edge technology and forward-thinking products.

I want to give a big shout-out to Nick Hasty, my PM and mentor for the summer. He introduced me to the GIPHY tech stack and gave me support throughout the project, from product vision to productionalizing the code. Ihor Kroosh and Denis Sergienko from Rails Reactor were both essential members of the team. Their expertise in Luigi scheduling, Kubernetes, and deployment was invaluable throughout the process. Anthony Johnson, Sixuan Liu, and Sean Quigley all took time to ensure the project’s success even though they are not members of the R&D team.

Thank you all!

– Zach Hay, Search Engineering Intern