gRPC Adoption in GIPHY

December 11, 2018 by Nima Khoshini

GIPHY’s API servers handle over a million requests a minute. To handle this load, we divide the work between a variety of microservices, each of which fulfills part of the request. For example, one service performs searches, another supplies metadata about GIFs, yet another validates user credentials, and so on. Our first iteration of microservices was initialized as standard REST servers with JSON payloads, but more recently we started to experiment with gRPC. We’ve seen improvements across the board in terms of performance, ease of development and cost.


What is gRPC?

gRPC is an open source Remote Procedure Call (RPC) library that is well supported in numerous languages, such as Python, Java, PHP, etc. It is based on Google’s internal RPC infrastructure, called Stubby, but rewritten to be more generic for public usage. Unlike the REST/JSON paradigm, in which the contract between client and server is defined in documentation, gRPC enforces strict contracts using an Interface Description Language (IDL). For each language, gRPC’s protoc compiler generates stub code for creating a client or a server, thereby making the integration drop-dead simple.

RPC is not a new concept by any means. It has existed for over 20 years in different formats (CORBA, SOAP, Java RMI, etc.). gRPC, however, is perhaps the simplest and most elegant implementation. It also leverages popular technologies under the hood, such as Protobuf for fast object serialization/deserialization, and HTTP/2 for persistent connections.


Migrating to Protobuf

Protocol Buffers, or Protobuf for short, is Google’s popular object serialization library that serves as the foundation for defining all gRPC objects and services. Some of our services already used Protobuf for incoming request payloads and outgoing response objects, making the transition straightforward. The remaining services had to be retrofitted to use Protobuf as a means of communication within the services.
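For illustration, the request and response types referenced throughout this post might be defined like this (the field names here are hypothetical, not our actual schema):

    syntax = "proto3";

    message CreateProfileRequest {
      string username = 1;
      string email = 2;
    }

    message UpdateProfileRequest {
      string user_id = 1;
      string email = 2;
    }

    message UserResponse {
      string user_id = 1;
      bool success = 2;
    }

From definitions like these, protoc generates the serialization code in each target language, so the same messages flow between services written in Scala, Python, or anything else.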

Most of our services are written in Scala; fortunately for us, we were able to leverage the ScalaPB library to facilitate gRPC integration. First, we mapped our REST endpoints into corresponding gRPC services. Along the way, we organized logical groups of functionality (say managing a User’s profile), into distinct services. Then, we implemented the generated server interfaces in our code.


For example, suppose we had this service for profile management:

service UserProfile {

      rpc CreateProfile (CreateProfileRequest) returns (UserResponse) {}

      rpc UpdateProfile (UpdateProfileRequest) returns (UserResponse) {}

}

Implementing the server-side portion of the generated ScalaPB trait would look something like:

    // Base trait for an asynchronous client and server.
    object UserProfile extends UserProfileGrpc.UserProfile {

      def updateProfile(request: UpdateProfileRequest): Future[UserResponse] = {
        // call update profile code and return a future
      }

      def createProfile(request: CreateProfileRequest): Future[UserResponse] = {
        // call create profile code and return a future
      }
    }


Since most of our code was structured around returning a Future, the changes in the existing codebase were trivial.

The last step for us was to wire up the client portion of the codebase. It turns out this is equally simple to set up:

    val channel = ManagedChannelBuilder.forTarget("localhost:5000").build()

    val client = UserProfileGrpc.stub(channel)



As you can see, it’s only a couple of lines to generate a client and server. The simplicity of integration alone makes it much easier to create and consume a service with gRPC instead of REST/JSON.
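With the stub in hand, invoking an RPC is an ordinary method call that returns a Future (a sketch; the request fields and the handling logic are illustrative):

    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.util.{Success, Failure}

    // The generated stub exposes each rpc as a method returning a Future.
    val response = client.updateProfile(UpdateProfileRequest())

    response.onComplete {
      case Success(user) => println(s"Profile updated: $user")
      case Failure(err)  => println(s"RPC failed: $err")
    }

All the transport details, serialization, and connection management are hidden behind that one method call.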


Deploying to Kubernetes

At GIPHY we use Kubernetes for pretty much every user-facing and internal deployment. Integrating gRPC services into Kubernetes was a very easy process at first. However, as we’ll show below, there are several important differences worth noting between gRPC and REST deployments.

Client-Side Load Balancing


Anyone who has deployed a gRPC service onto Kubernetes will know that load balancing it properly is not as straightforward as with a RESTful service. Because gRPC uses HTTP/2, which maintains long-lived connections, a service will never properly load balance between different pods using the techniques usually applied to a RESTful service. However, there are several other load balancing techniques that work well for gRPC, either at the infrastructure level or the code level.

We chose to tackle this problem using client-side load balancing for now. There are a couple of examples on Github for Java-based services that implement what is known as a NameResolverProvider. We created a NameResolver that watches the Kubernetes API for changes to a deployment. Watching the Kubernetes API means that when pods are added or removed, the client knows about it almost instantly and can adjust calls accordingly. The sample code we used as inspiration can be found here.

While this approach is nifty, it’s unfortunately language specific. Leveraging a service mesh like Linkerd or an Envoy proxy would go much further in addressing our concerns. If you’re looking for more info on load balancing gRPC with Kubernetes, you can head here.


Health Checks


One of the problems we faced when deploying these new services to Kubernetes was the lack of the simple health checks we previously had with the REST services. Since gRPC uses Protobuf to communicate, there is no simple way to add a liveness or readiness probe within a Kubernetes deployment to check the service’s health. There are, however, some libraries that formalize and facilitate the creation of health check services.

We implemented grpc-health-probe as outlined in this blog post.


The integration of health checks required us to:

– Update our gRPC servers to implement the generic Health service.

– Implement the health checks on the server. The checks are unique to each service. For example, one service was validating results from a SQL call while another was simply performing a DNS host name check.

– Update our Kubernetes manifests to make a binary call to grpc_health_probe.
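The manifest change for the last step is small: Kubernetes runs the probe binary inside the container via an exec probe. A sketch of what this looks like (the port, binary path, and timing values here are assumptions, not our production settings):

    readinessProbe:
      exec:
        command: ["/bin/grpc_health_probe", "-addr=:5000"]
      initialDelaySeconds: 5
    livenessProbe:
      exec:
        command: ["/bin/grpc_health_probe", "-addr=:5000"]
      initialDelaySeconds: 10

The probe binary makes a standard gRPC call to the Health service and exits non-zero if the service reports itself unhealthy, which is exactly the signal Kubernetes needs.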

Prior to implementing health checks, we would occasionally see spikes in latency for some deployments. We tracked this down to Kubernetes draining a node and thus killing a gRPC pod, which was then brought up on another node. After implementing the health checks, we were happy to see consistent response times and no more latency spikes.


Deadlines and Circuit Breakers


Timeouts are important for both a client and a server. As a client, you do not want to wait indefinitely to hear back from the server.

The gRPC client libraries have a built in notion of deadlines, making timeouts quite simple to enable. In our Scala code, we made sure every client call specified a deadline by using the withDeadlineAfter method. This alleviated client calls waiting around forever.
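For example, with the grpc-java stubs that ScalaPB wraps, a per-call deadline can be attached directly to the stub (the 500 ms value here is illustrative, not our production setting):

    import java.util.concurrent.TimeUnit

    // withDeadlineAfter returns a new stub; the deadline applies
    // to calls made on that stub.
    val response = client
      .withDeadlineAfter(500, TimeUnit.MILLISECONDS)
      .updateProfile(UpdateProfileRequest())

    // If the deadline passes before the server responds, the Future
    // fails with a StatusRuntimeException of DEADLINE_EXCEEDED.

Because the deadline lives on the stub rather than on each call site, it is easy to enforce a sensible default in one place.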

Unfortunately, in our testing, enabling a client call deadline did not kill the request on the server and subsequently free up its resources. In other words, if a client makes an expensive call to the server (think of a slow DB call), the client will give up after its deadline is exceeded, but the server may sit around indefinitely until it generates a response.

To address this, we introduced some very lightweight circuit-breaking logic to time out requests after a specified interval. Additionally, we specified a backoff interval to fail requests quickly should the server be under stress. This helped us ensure that in disaster scenarios, in which communication with a shared resource cannot be established, we can reliably fail requests quickly while maintaining consistent levels of CPU and memory usage.

There are numerous Circuit Breaker libraries out there; Netflix’s Hystrix is the most popular. The Akka library has a very clean and simple implementation for Scala microservices as well.
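As an illustration of the Akka approach, a breaker that fails fast after repeated slow calls might be wired up like this (the thresholds are made up for the example):

    import scala.concurrent.Future
    import scala.concurrent.duration._
    import akka.actor.ActorSystem
    import akka.pattern.CircuitBreaker

    implicit val system: ActorSystem = ActorSystem("services")
    import system.dispatcher

    // Open the breaker after 5 failures; any call taking longer than
    // 1 second counts as a failure; probe again (half-open) after 30s.
    val breaker = new CircuitBreaker(
      system.scheduler,
      maxFailures = 5,
      callTimeout = 1.second,
      resetTimeout = 30.seconds
    )

    // While the breaker is open, calls fail immediately with a
    // CircuitBreakerOpenException instead of piling up on a slow server.
    def updateProfile(req: UpdateProfileRequest): Future[UserResponse] =
      breaker.withCircuitBreaker(client.updateProfile(req))

Wrapping the gRPC call this way gives the fast-fail behavior described above without any changes to the generated client code.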



As you can see, migrating from REST to gRPC is more involved than simply swapping out libraries. However, the migration is worthwhile: we have noticed gRPC deployments using up to 10x less CPU than their REST counterparts. This can be attributed to the use of Protobuf across the board and the elimination of JSON serialization/deserialization. With the savings in CPU usage, our service deployments can handle higher loads with fewer replicas. That, plus not having to write boilerplate integration code in consuming services, makes it a winning strategy. No doubt about it, gRPC makes writing microservices fun, fast, scalable and easy.

— Nima Khoshini, Services Team Lead