Traditional Hashing - Ryan Lynch's Hub

# Overview

Traditional hashing is a basic approach for distributing data across N nodes (i.e. machines). [[Hashing]] takes in data and produces a corresponding value through some deterministic algorithm: the output is always the same if the data entered into the algorithm is the same.

In traditional hashing, the formula `hash(data) % num_nodes` is used to assign a node to a piece of data using the [[modulo operator]]. To achieve a uniform distribution across nodes, the hash function needs to be selected carefully so that it spreads values evenly.

# Key Considerations

This approach works well for a static number of nodes. If a node is added or removed, however, the value of `num_nodes` changes, so the result of `hash(data) % num_nodes` changes for many keys and that data needs to be redistributed. [[Consistent Hashing]] is an approach that aims to resolve this issue.

# Pros

# Cons

- If the number of buckets changes (node added or removed), then a lot of values will be assigned new nodes. In a distributed system like a database, this means an excessive amount of data has to move between nodes.
- If you're unlucky with your hashing scheme, a lot of values might get hashed to the same node, resulting in uneven load between nodes.

# Use Cases

# Related Topics
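
# Example

A minimal Python sketch of the `hash(data) % num_nodes` formula and of the redistribution problem when `num_nodes` changes. The helper names (`stable_hash`, `assign_node`, `fraction_moved`), the choice of SHA-256, and the `user-<i>` keys are assumptions made for illustration, not part of any particular system.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic hash: the same key always yields the same integer.

    SHA-256 is an illustrative choice. Python's built-in hash() is salted
    per process for strings, so it is not stable across runs.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def assign_node(key: str, num_nodes: int) -> int:
    """Traditional hashing: hash(data) % num_nodes."""
    return stable_hash(key) % num_nodes


def fraction_moved(keys, old_nodes: int, new_nodes: int) -> float:
    """Fraction of keys whose assigned node changes when the node count changes."""
    moved = sum(
        1 for k in keys
        if assign_node(k, old_nodes) != assign_node(k, new_nodes)
    )
    return moved / len(keys)


# Hypothetical keys, used only to measure how much data would move.
keys = [f"user-{i}" for i in range(10_000)]
print(assign_node("user-42", 4))   # some node index in 0..3
print(fraction_moved(keys, 4, 5))  # share of keys reassigned going from 4 to 5 nodes
```

With an evenly distributing hash, a key keeps its node when going from 4 to 5 nodes only if `h % 4 == h % 5`, which holds for 4 out of every 20 hash values. Roughly 80% of keys should therefore be reassigned by adding a single node, which is the cost described under Cons.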