One of the challenges we face almost every day is keeping our API latency low. While the problem sounds simple on the surface, it gets interesting sometimes. One of our endpoints, which serves restaurant menus to our consumers, had high p99 latency numbers. Since it's a high-traffic endpoint, we naturally use caching pretty intensively: we cache our serialized menus in Redis to avoid repeated calls to the database and to spread out the read traffic load. By the end of this post we will present how we used compression not only to improve our latency, but also to get ourselves more space to cache.

After some deep instrumentation and inspection, we determined that the problem in this particular scenario was that some of our menus were almost half a MB long. Our instrumentation showed us that reading these large values repeatedly during peak hours was one of the few reasons for our high p99 latency. During peak hours, reads from Redis sometimes, seemingly at random, took more than 100ms. This was especially true when a restaurant or a chain with really large menus was running promotions. Why this happens should be a surprise to no one: reading or writing many large payloads over the network during peak hours can end up causing network congestion and delays.

To fix this issue, we obviously wanted to reduce the amount of traffic between our server nodes and the cache. We were well aware of techniques like LevelDB using Snappy to compress blocks and decrease the on-disk size. Similarly, our friends at CloudFlare used a similar technique to squeeze more speed out of Kafka. So the obvious move was to use a compression algorithm with good speed and a decent compression ratio.

Like other folks, we did our own benchmarks and found that LZ4 and Snappy were two nice options. We also considered other famous options like Zlib, Zstandard, and Brotli, but found their decompression speeds (and CPU load) were not ideal for our scenario. Due to the specific nature of our endpoint, we found LZ4 and Snappy more favorable: both libraries were in the Goldilocks zone of compression/decompression speed, CPU usage, and compression ratio. There are a plethora of benchmarks on the internet already comparing compression speeds and ratios, so without going into detail and repeating the same benchmarks, here are some examples and a summary of our findings: 64,220 bytes of Chick-fil-A menu (serialized JSON) compressed down to 10,199 bytes with LZ4, and 11,414 bytes with Snappy.
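We haven't published our benchmark harness, but if you want to sanity-check numbers like these yourself, a minimal sketch along the following lines will do. It assumes the Python `lz4` and `python-snappy` packages, and the generated payload is just a stand-in for a real serialized menu:

```python
# Sketch of a compressed-size comparison, assuming the `lz4` and
# `python-snappy` packages; the payload below is a stand-in, not a
# real menu document.
import json

import lz4.frame
import snappy

# Build a ~64 KB JSON blob to roughly match our serialized menu size.
menu_json = json.dumps(
    {"items": [{"name": f"item-{i}", "price": 4.99} for i in range(2000)]}
).encode("utf-8")

lz4_bytes = lz4.frame.compress(menu_json)
snappy_bytes = snappy.compress(menu_json)

print(f"raw:    {len(menu_json):>7,} bytes")
print(f"lz4:    {len(lz4_bytes):>7,} bytes")
print(f"snappy: {len(snappy_bytes):>7,} bytes")
```

Exact sizes depend heavily on how repetitive the JSON is; menu documents compress well precisely because they repeat keys and structure thousands of times.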
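To give a concrete picture of where the compression sits in the cache path, here is a hedged sketch using redis-py and LZ4. The helper names (`cache_menu`, `read_menu`) and the key scheme are ours for illustration, not from our production code:

```python
# A minimal sketch of compress-on-write / decompress-on-read caching,
# assuming redis-py and the `lz4` package. Names and TTLs here are
# illustrative assumptions.
import lz4.frame
import redis

r = redis.Redis(host="localhost", port=6379)

def cache_menu(restaurant_id: str, serialized_menu: bytes, ttl_s: int = 300) -> None:
    # Compress before writing: both the bytes on the wire and the
    # memory the value occupies inside Redis shrink (roughly 6x for
    # payloads like the menu above).
    r.set(f"menu:{restaurant_id}", lz4.frame.compress(serialized_menu), ex=ttl_s)

def read_menu(restaurant_id: str) -> bytes | None:
    compressed = r.get(f"menu:{restaurant_id}")
    if compressed is None:
        return None  # cache miss; the caller falls back to the database
    return lz4.frame.decompress(compressed)
```

The trade is straightforward: a little CPU on each read and write in exchange for far fewer bytes moving between the server nodes and Redis, which is exactly the congestion we were trying to relieve.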