Monday 1 December 2014

Redis pipeline explained

So, I had made some conceptual assumptions in my previous post that turned out to be incorrect in practice, according to feedback I got from the Redis Google group. So here's another post, which attempts to stay closer to how Redis pipelining is actually implemented.

Here's the thing: pipelining is purely a client-side implementation. Redis itself has very little to do with it.

Now, that makes it difficult to generalize, since the client-side implementation can differ between client libraries. Please have a look at this post for why pipelining is useful. Here's roughly what happens when operations are pipelined by a client (a code sketch follows the list):
  • The client buffers the Redis commands/operations on its side.
  • The buffer is flushed to the server periodically, either synchronously or asynchronously, depending on the client library's implementation.
  • Redis executes these operations as soon as they are received at the server side.
  • Subsequent Redis commands are sent without waiting for the responses to the previous commands. Meanwhile, the client is generally programmed to return a placeholder (such as a constant string) for every operation in the sequence as an immediate response.
  • The tricky part: the final responses. It is often wrongly assumed that all the responses are always read in one shot and that they may be completely buffered on the server's side. Even though the responses to all the operations seem to arrive in one shot when the client closes the pipeline, they are actually partially buffered on both the client and the server. Again, this depends on the client implementation. A client may read the buffered responses periodically to avoid a huge accumulation on the server side, but it may also not bother. For example, the current implementation of the Jedis client library reads all the responses of a pipeline sequence at once.
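
To make that flow concrete, here is a minimal sketch of a pipelined sequence using Jedis. This is only an illustration of the steps above, assuming a recent Jedis version and a Redis server on localhost; the key names and values are made up for the example. Note that calling get() on a Response before the pipeline is synced fails, because the reply has not been read from the server yet.

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.Pipeline;
    import redis.clients.jedis.Response;

    public class PipelineSketch {
        public static void main(String[] args) {
            Jedis jedis = new Jedis("localhost", 6379);

            // Open a pipeline: from here on, commands are buffered on the client side
            Pipeline p = jedis.pipelined();

            // Each call returns immediately with a Response placeholder;
            // no actual reply is available yet
            Response<String> setReply = p.set("counter", "0");
            Response<Long> incrReply = p.incr("counter");
            Response<String> getReply = p.get("counter");

            // sync() flushes the buffered commands to the server and reads back
            // all the replies in one shot (this is how Jedis currently behaves)
            p.sync();

            System.out.println(setReply.get());  // OK
            System.out.println(incrReply.get()); // 1
            System.out.println(getReply.get());  // 1

            jedis.close();
        }
    }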
What this means and how it needs to be interpreted in practice:
  • Pipelining is super fast and is a good option for multiple subsequent client-server interactions in a high-latency environment.
  • Pipelining does not guarantee atomicity. That is the job of a transaction.
  • Pipelining is not suitable when later Redis operations depend on the responses of preceding ones.
  • The performance of pipelining depends on the client library being used.
  • There should be a reasonable upper limit on the number of operations in a pipeline sequence, as the server's memory may run out from buffering too many responses (see the sketch after this list).
  • Issues with pipelining may be observed even in a low-latency environment because of bandwidth constraints such as a low MTU (Maximum Transmission Unit).
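One way to respect such an upper limit is to flush the pipeline in chunks. The sketch below is just an illustration of the idea, again using Jedis; BATCH_SIZE and the key names are arbitrary values picked for the example and would need tuning for a real workload.

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.Pipeline;

    public class ChunkedPipelineSketch {
        // Arbitrary cap on the number of un-synced commands; tune for your setup
        private static final int BATCH_SIZE = 1000;

        public static void main(String[] args) {
            Jedis jedis = new Jedis("localhost", 6379);
            Pipeline p = jedis.pipelined();

            for (int i = 0; i < 100000; i++) {
                p.set("key:" + i, String.valueOf(i));

                // Flush periodically so the server never has to buffer
                // more than BATCH_SIZE replies for this client
                if ((i + 1) % BATCH_SIZE == 0) {
                    p.sync();
                }
            }
            p.sync(); // flush whatever is left
            jedis.close();
        }
    }
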
Why this approach might have been faster can be answered by the last point above: it exploited the available bandwidth but did little to cut down on latency.

Here are some links I read through to arrive at these conclusions:
  • https://groups.google.com/forum/m/#!topic/redis-db/3kRJdugPTNM
  • https://github.com/xetorthio/jedis/issues/229
  • https://groups.google.com/forum/#!topic/redis-db/BMe1uTOZbpc
  • http://redis.io/topics/pipelining
  • https://github.com/xetorthio/jedis/pull/349
