Homepage › Solution manuals › Joseph Blitzstein › Introduction to Probability › Exercise 1.27
Exercise 1.27
A hash table is a commonly used data structure in computer science, allowing for fast information retrieval. For example, suppose we want to store some people’s phone num- bers. Assume that no two of the people have the same name. For each name x, a hash function h is used, letting h(x) be the location that will be used to store x’s phone number. After such a table has been computed, to look up x’s phone number one just recomputes h(x) and then looks up what is stored in that location.
The hash function h is deterministic, since we don’t want to get different results every time we compute h(x). But h is often chosen to be pseudorandom. For this problem, assume that true randomness is used. Let there be k people, with each person’s phone number stored in a random location (with equal probabilities for each location, inde- pendently of where the other people’s numbers are stored), represented by an integer between 1 and n. Find the probability that at least one location has more than one phone number stored there.
Answers
For each of the names, we sample a memory location from to with equal probability, with replacement. This is exactly the setup of the birthday problem. Hence, the probability that at least one memory location has more than value is
Also, if .