# Overview
In distributed systems, data often needs to be divided across hardware to allow for [[Horizontal Scaling]]. How the data is divided depends on the selected [[Partitioning#Partitioning Strategies|partitioning strategy]], which aims to distribute data either uniformly or in a way that optimizes processing (e.g., ensuring related data is stored together). The Hot Key (or Celebrity) Problem occurs when one of these segments is accessed much more often than the others, putting strain on the hardware that stores it and negating the benefits of horizontal scaling.

As an example, imagine storing Instagram pictures. The segment storing Cristiano Ronaldo's data would get much more traffic than the one storing Ryan Lynch's.

# Key Considerations
## Potential Solutions for Hot Key or Celebrity Problem #flashcard
- **Partition Purely Based on Creating a Uniform Distribution** - if the order of the data does not matter, the partitions can use a hash function that aims for a uniform distribution of the data across nodes.
- **Random [[Salting]]** - append a random number or timestamp to the end of the value used as the partition key. This helps distribute the load more evenly across multiple partitions, though it may complicate logic on the consumer side, which now has to read from every salted variant of the key.
- **Use a Compound Key** - concatenate multiple fields together to form the partition key, so related data is still grouped together but may end up on different nodes depending on the concatenated portions of the key.
- **[[Backpressure]]** - work with the producer to slow down incoming data for that particular partition. This does not work for all use cases.
- **Redundant Caches** - use multiple caches that aren't coordinated in any way, and let the other services randomly select one. This increases load on the database, since each cache is filled independently, but spreads the read load across caches.
<!--ID: 1751507776417-->

# Pros

# Cons

# Use Cases

# Related Topics
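
# Example
The random salting idea from the potential solutions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: the partition count, the number of salt buckets, the `#` separator, and every function name here (`partition_for`, `salted_key`, `all_salted_keys`, `simulate_writes`) are hypothetical choices made for the example.

```python
import hashlib
import random
from collections import Counter

NUM_PARTITIONS = 8  # assumed number of partitions (hypothetical)
SALT_BUCKETS = 4    # assumed number of salted variants per hot key (hypothetical)


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def salted_key(key: str, rng: random.Random, salt_buckets: int = SALT_BUCKETS) -> str:
    """Writer side: append a random salt so writes for one key can spread out."""
    return f"{key}#{rng.randrange(salt_buckets)}"


def all_salted_keys(key: str, salt_buckets: int = SALT_BUCKETS) -> list[str]:
    """Reader side: the consumer must look up every salted variant and merge."""
    return [f"{key}#{i}" for i in range(salt_buckets)]


def simulate_writes(key: str, n: int, salted: bool, seed: int = 0) -> Counter:
    """Count how many of n writes for a single key land on each partition."""
    rng = random.Random(seed)
    counts: Counter = Counter()
    for _ in range(n):
        routed_key = salted_key(key, rng) if salted else key
        counts[partition_for(routed_key)] += 1
    return counts


if __name__ == "__main__":
    print("unsalted:", dict(simulate_writes("ronaldo", 10_000, salted=False)))
    print("salted:  ", dict(simulate_writes("ronaldo", 10_000, salted=True)))
```

Without salting, every write for the hot key hashes to the same partition. With salting, writes can land on up to `SALT_BUCKETS` partitions (fewer if two salted variants happen to hash to the same one), and the reader pays for that by fetching all keys returned by `all_salted_keys` and merging the results.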