Chapter 11

Exercise 11.1-1

Available in the latest revision of the IM.

Exercise 11.1-2

Available in the latest revision of the IM.

In java, the class BitSet implements a vector of bits that grows as needed. In C++, the class template bitset represents a fixed-size sequence of bits.

Exercise 11.1-3

Available in the latest revision of the IM.

If any element among those with an equal key suffices to serve as a return value from Search, then there is no need to keep them all. Hence, we can avoid the usage of doubly linked lists, as in the IM, by working with (x,count)(x,count) ordered pairs, where xx is an object or pointer to an object. So, each slot is either nil (None in Python) or an ordered pair. Insertions increase, while deletions decrease those counters.

Exercise 11.1-4

Available in the latest revision of the IM.

Exercise 11.2-1

Available in the latest revision of the IM.

Exercise 11.2-2

We start with an empty table T[0:8]T[0:8]. Elements which appear to the left in the table have been added later.

Slot
Content

0

\empty

1

10, 19, 28

2

20

3

12

4

\empty

5

5

6

33, 15

7

\empty

8

17

Exercise 11.2-3

Available in the latest revision of the IM.

Exercise 11.2-4

Available in the latest revision of the IM.

Exercise 11.2-5

Available in the latest revision of the IM.

Exercise 11.2-6

Available in the latest revision of the IM.

Exercise 11.3-1

Available in the latest revision of the IM.

Exercise 11.3-2

Available in the latest revision of the IM.

Exercise 11.3-3

Available in the latest revision of the IM.

Exercise 11.3-4

61 → 700 62 → 318 63 → 936 64 → 554 65 → 172

Exercise 11.3-5

Available in the latest revision of the IM.

Exercise 11.3-6

The size of our family of hash functions is Zp=p|\Z_p|=p, as we have pp choices to select hbh_b. We need to calculate the probability of a collision hb(k1)=hb(k2)h_b(k_1)=h_b(k_2) for distinct keys k1k_1 and k2k_2. This probability is over the picks of hbh_b. Observe that

0hb(k1),hb(k2)<p    p<hb(k1)hb(k2)<p0 \le h_b(k_1),h_b(k_2)<p \implies -p < h_b(k_1)-h_b(k_2)<p

which gives

hb(k1)hb(k2)=0 ( ⁣ ⁣ ⁣ ⁣ ⁣ ⁣modp)    hb(k1)=hb(k2)h_b(k_1)-h_b(k_2)=0\space(\!\!\!\!\!\!\mod p) \implies h_b(k_1)=h_b(k_2)

We have the following equalities all taken modulo pp

hb(k1)hb(k2)=j=0d1k1jbjj=0d1k2jbj=j=0d1(k1jk2j)bjh_b(k_1)-h_b(k_2)=\sum_{j=0}^{d-1} k_{1j}b^j-\sum_{j=0}^{d-1} k_{2j}b^j=\sum_{j=0}^{d-1} (k_{1j}-k_{2j})b^j

Following the hint from the book, according to Exercise 31.4-4 the above polynomial can have at most d1d-1 distinct zeros modulo pp. In other words, at most d1d-1 choices of bb can lead to a collision. Therefore, our family of hash functions is ϵ\epsilon-universal for ϵ=(d1)/p\epsilon = (d-1)/p.

Exercise 11.4-1

Use VisuAlgo and select linear probing in the top menu bar. Choose the option in the left sidebar to create an empty hash table with m=11m=11 slots and n=0n=0 elements and press Go. Afterward, select the command to insert elements 10,22,31,4,15,28,17,88,59 and press Go.

For double hashing, the above website uses a different auxiliary hash function h2h_2, so we cannot use it here. The hash table is filled as follows:

0
1
2
3
4
5
6
7
8
9
10
10
22
10
22
31
10
22
4
31
10
22
4
15
31
10
22
4
15
28
31
10
22
17
4
15
28
31
10
22
17
4
15
28
88
31
10
22
59
17
4
15
28
88
31
10

Exercise 11.4-2

Hash-Delete expects the key to be present in the hash table. Otherwise, there should be another check against nil for the returned index.

Hash-Delete(T, k)
    i = Hash-Search(T, k)
    T[i] = DELETED

The Hash-Insert procedure from the book requires a slight change in 4 line as follows:

if T[q] == nil or T[q] == DELETED

The Hash-Search doesn't need any change.

Exercise 11.4-3

Available in the latest revision of the IM.

Exercise 11.4-4

Instead of doing integral approximation (see page 300) we simply use the definition of the mmth harmonic humber.

1αk=mn+1m1k=k=1m1k=Hm\frac{1}{\alpha}\sum_{k=m-n+1}^m \frac{1}{k}=\sum_{k=1}^m \frac{1}{k}=H_m

Exercise 11.4-5

This directly follows from Theorem 31.20, since h2(k)=m/d|\langle h_2(k)\rangle|=m/d.

Exercise 11.4-6

Enter solve 1/(1-x)=2/x ln(1/(1-x)) in WolframAlpha to get α0.71533\alpha \approx 0.71533 by clicking on the Approximate form button inside the Result pane.

Exercise 11.5-1

fa(0)(k)f_a^{(0)}(k) is an identity function, which is obviously injective. If we prove that fa=fa(1)(k)f_a=f_a^{(1)}(k) is injective (base case), then fa(r)(k)=fafa(r1)(k)f_a^{(r)}(k)=f_a \circ f_a^{(r-1)}(k) for r>1r>1 is also injective by induction on rr. Note that the composition of two bijections is again a bijection (see solution for Exercise B.3-4).

We know that the swapswap function only rearranges the bits of an input independently of it's value, therefore x=y    swap(x)=swap(y)x=y \iff swap(x)=swap(y).

Suppose, for the sake of contradiction, that there are two distinct inputs xx and yy such that fa(x)=fa(y)f_a(x) =f_a(y). WLOG assume that x>yx>y, otherwise, exchange them. Thus, we have

2x2+ax=2y2+ay ( ⁣ ⁣ ⁣ ⁣ ⁣ ⁣mod2ω)    2ω(2x2+ax)(2y2+ay)    2ω2(xy)(x+y)+a(xy)    2ω(xy)(2(x+y)+a)(contradiction!)\begin{align*} &2x^2+ax = 2y^2+ay\space(\!\!\!\!\!\!\mod 2^{\omega}) \\ &\implies 2^{\omega} | (2x^2+ax) - (2y^2+ay) \\ &\implies 2^{\omega} | 2(x-y)(x+y)+a(x-y) \\ &\implies 2^{\omega} | (x-y)(2(x+y)+a) && \text{(contradiction!)} \end{align*}

We have reached a contradiction, since aa is odd and 2ω(xy)2^{\omega} ∤ (x-y). Recall that both xx and yy are ω\omega-bit words, hence xy<2ωx-y < 2^{\omega}. Consequently, faf_a is injective.

Exercise 11.5-2

A random oracle is equivalent to an independent uniform hash function family. Let us take 5 distinct keys k1,k2,k3,k4,k5k_1,k_2,k_3,k_4,k_5. Each will be independently and uniformly mapped by a random oracle to a slot in the range [0,m)[0,m). There are m5m^5 possible mappings, each with an equal probability of occurrence 1/m51/m^5. Thus, a random oracle is 5-independent according to the definition from the book on page 288.

Since a random oracle guarantees independence for any number of distinct inputs, it inherently satisfies 5-independence. 5-independence is a weaker condition than full independence; a fully independent function trivially satisfies kk-independence for all kk.

Exercise 11.5-3

At least r=3r=3 rounds are required to trigger an avalanche effect potentially affecting any bit of the output to flip. In the illustration below, light red colored cells could be affected by flips in previous rounds. After 3 rounds, all cells are marked red, hence may be impacted by flipping a single bit of the input value kk. Observe that flipping the most-significant bit of kk only impacts the most-significant bit of ga(k)g_a(k) in the first round. Any other bit of kk can only make the situation better. Hence, we consider a flip in bit position ω1\omega-1 as the worst-case scenario. Note that round 0 is an identity function, so it is essentially the key itself. Each round consists of two parts: the upper is ga(k)g_a(k) and the lower is fa(k)f_a(k) of the corresponding round.

Drawing

Problem 11-1

Available in the latest revision of the IM.

Problem 11-2

Available in the latest revision of the IM.

Problem 11-3

Available in the latest revision of the IM.

Problem 11-4

a.

Arbitrarily pick two distinct keys x,yUx,y \in U. If H\mathcal{H} is 2-independent, then we can have m2m^2 different pairs (h(x),h(y))(h(x),h(y)) of hash values with equal probability of occurrence 1/m21/m^2. Collisions are denoted by pairs (i,i)(i,i) for i=0,1,,m1i=0,1,\dots,m-1. Therefore, the probability of a collision for two distinct keys is m/m2=1/mm/m^2=1/m. Consequently, H\mathcal{H} is universal.

b.

First, we show that is H\mathcal{H} universal. Arbitrarily pick two distinct keys x,yUx,y \in U. They must differ in at least one component. WLOG let it happen at index 0i<n0 \le i < n. For a collision to occur, with a randomly picked hash function haHh_a \in \mathcal{H}, we must have ha(x)=ha(y)h_a(x)=h_a(y). Using the definition of hah_a we get

j=0n1ajxj=j=0n1ajyj ( ⁣ ⁣ ⁣ ⁣ ⁣ ⁣modp)    j=0n1aj(xjyj)=0 ( ⁣ ⁣ ⁣ ⁣ ⁣ ⁣modp)    ai(xiyi)=0j<nijaj(xjyj) ( ⁣ ⁣ ⁣ ⁣ ⁣ ⁣modp)\begin{align*} &\sum_{j=0}^{n-1} a_jx_j = \sum_{j=0}^{n-1} a_jy_j \space(\!\!\!\!\!\!\mod p) \\ &\implies \sum_{j=0}^{n-1} a_j(x_j-y_j)=0 \space(\!\!\!\!\!\!\mod p) \\ &\implies a_i(x_i-y_i)=-\sum_{\substack{0\le j<n\\i \neq j}} a_j(x_j-y_j) \space(\!\!\!\!\!\!\mod p) \\ \end{align*}

Observe that (p1)xiyip1-(p-1) \le x_i-y_i \le p-1 and xiyi0x_i-y_i \neq 0 implies that xiyix_i-y_i has a multiplicative inverse modulo pp. Therefore,

ai=(xiyi)10j<nijaj(xjyj)\begin{equation} a_i=-(x_i-y_i)^{-1}\sum_{\substack{0\le j<n\\i \neq j}} a_j(x_j-y_j) \end{equation}

Notice that aia_i is fully determined based on the choice of xx, yy and other components of aa. Thus, the probability of a collision is 1/p1/p, since for a randomly selected nn-tuple aa, there are pp possible values at index ii and only one leads to a collision. Furthermore, if ha(x)=ha(y)h_a(x)=h_a(y) then all coefficients aka_k, associated with corresponding nonzero terms xkykx_k-y_k, simultaneously satisfy (1) (just set i=k)i=k). This shows that H\mathcal{H} is universal.

Following the hint from the book, it is easy to see that x0=0,0,,0x_0=\langle0,0,\dots,0\rangle always hashes to zero. Therefore, the condition to have p2p^2 unique pairs of hash values, with equal probability of occurrence, for any two distinct keys is not satisfied. Consequently, H\mathcal{H} is not 2-independent.

c.

Taking the hint from the book, let's see what happens to hab(x)h_{ab}^{'}(x) as aia_i and bb range over Zp\Z_p. The same reasoning will also apply to hab(y)h_{ab}^{'}(y). Let c=0j<nijajxjc=\sum_{\substack{0\le j<n\\i \neq j}} a_jx_j, since we keep all other parameters fixed. This constant cc has no influence on the subsequent analysis.

We must show that hab(x)h_{ab}^{'}(x) can take any value from Zp\Z_p and equal number of distinct pairs (ai,b)(a_i,b) map to each such output. There are two cases to consider:

  • If xi=0x_i=0 then each value of bb produces a unique hash output, so it may range over Zp\Z_p. For every bb we can freely assign pp possible values to aia_i. In other words, pp distinct pairs map to each distinct hash value.

  • Otherwise, according to theorem 31.20, we know that xi=p|\langle x_i \rangle|=p, since xix_i is relatively prime to pp. In this case, the parameter bb only shifts the full sequence of pp values generated by xix_i. There are pp such shifts. Thus, the output may range over Zp\Z_p and again pp distinct pairs (ai,b)(a_i,b) map to each distinct hash value.

We can conclude that H\mathcal{H}^{'} is 2-independent.

d.

Since H\mathcal{H} is 2-independent, we can have p2p^2 different pairs of hash values, with equal probability of occurrence 1/p21/p^2, for two distinct keys x,yUx,y \in U. The adversary may fool Bob, assuming that mmm \neq m', if h(m)=h(m)=th'(m')=h(m')=t', where hh denotes the shared hash function between Alice and Bob. All the adversary can do is to find a hash function hh' such that h(m)=h(m)=th'(m)=h(m)=t, hoping that it will also collide on mm' with hh. There are pp pairs of possible hash values whose first component is tt. The probability that hh' will also generate the same second component as hh for mm' is 1/p1/p. Therefore, no matter how much computing power the adversary has, her/his probability of success is at most 1/p1/p.

As a side note, the previous analysis may also be carried out in an opposite direction. Namely, to ask what is the probability that Alice and Bob had selected a random hash function that hashes to the same value as hh' for mm'. Among those that produce tt for mm only (1/p)(1/p)th generate tt' for mm'. So, we arrive at the same conclusion.

Last updated