Exercise 1.27

A hash table is a commonly used data structure in computer science, allowing for fast information retrieval. For example, suppose we want to store some people’s phone num- bers. Assume that no two of the people have the same name. For each name x, a hash function h is used, letting h(x) be the location that will be used to store x’s phone number. After such a table has been computed, to look up x’s phone number one just recomputes h(x) and then looks up what is stored in that location.

The hash function h is deterministic, since we don’t want to get different results every time we compute h(x). But h is often chosen to be pseudorandom. For this problem, assume that true randomness is used. Let there be k people, with each person’s phone number stored in a random location (with equal probabilities for each location, inde- pendently of where the other people’s numbers are stored), represented by an integer between 1 and n. Find the probability that at least one location has more than one phone number stored there.

Answers

For each of the k names, we sample a memory location from 1 to n with equal probability, with replacement. This is exactly the setup of the birthday problem. Hence, the probability that at least one memory location has more than 1 value is

P(A) = 1 P(Ac) = 1 n(n 1)(n k + 1) nk

Also, P(A) = 1 if n < k.

User profile picture
2021-12-05 00:00
Comments