Interesting. Some points worth mentioning about optimization: 1. The design uses

Interesting. Some points worth mentioning about optimization: 1. The design uses a server-side routing between shards. Instead, you could take a leaf from Google's own file storage mechanism and push "which branch is at which shard" to the client (using logical names of course, not IP addresses). That would save you one critical hop on a cluster that is going to be super heavy (receive all calls and forward them). It's important to note that a user starting to type a string will always get sent to the same shard once he's past the common prefix - so we'll really be saving an intra datacenter call for each query. You could do this after-the-fact - meaning keeping the cluster up but letting the client see where the response came from so the next iteration could be sent to the correct shard, or pre-fetch that information and make sure the client is notified on change.

Click here to start solving coding interview questions