# Chapter 11

## Exercise 11.1-1

**Available in the latest revision of the IM.**

## Exercise 11.1-2

**Available in the latest revision of the IM.**

{% hint style="info" %}
In java, the class [BitSet](https://docs.oracle.com/javase/8/docs/api/java/util/BitSet.html) implements a vector of bits that grows as needed. In C++, the class template [bitset](https://en.cppreference.com/w/cpp/utility/bitset.html) represents a fixed-size sequence of bits.
{% endhint %}

## Exercise 11.1-3

**Available in the latest revision of the IM.**

{% hint style="info" %}
If any element among those with an equal key suffices to serve as a return value from `Search`, then there is no need to keep them all. Hence, we can avoid the usage of doubly linked lists, as in the IM, by working with $$(x,count)$$ ordered pairs, where $$x$$ is an object or pointer to an object. So, each slot is either `nil` (`None` in Python) or an ordered pair. Insertions increase, while deletions decrease those counters.
{% endhint %}

## Exercise 11.1-4

**Available in the latest revision of the IM.**

{% hint style="warning" %}
Setting `T[k]=nil` is redundant (see the sequence of assignments for deletion in the IM).
{% endhint %}

## Exercise 11.2-1

**Available in the latest revision of the IM.**

## Exercise 11.2-2

We start with an empty table $$T\[0:8]$$. Elements which appear to the left in the table have been added later.

<table><thead><tr><th width="89.196533203125" align="center">Slot</th><th width="105.75">Content</th></tr></thead><tbody><tr><td align="center">0</td><td><span class="math">\empty</span></td></tr><tr><td align="center">1</td><td>10, 19, 28</td></tr><tr><td align="center">2</td><td>20</td></tr><tr><td align="center">3</td><td>12</td></tr><tr><td align="center">4</td><td><span class="math">\empty</span></td></tr><tr><td align="center">5</td><td>5</td></tr><tr><td align="center">6</td><td>33, 15</td></tr><tr><td align="center">7</td><td><span class="math">\empty</span></td></tr><tr><td align="center">8</td><td>17</td></tr></tbody></table>

## Exercise 11.2-3

**Available in the latest revision of the IM.**

## Exercise 11.2-4

**Available in the latest revision of the IM.**

## Exercise 11.2-5

**Available in the latest revision of the IM.**

## Exercise 11.2-6

**Available in the latest revision of the IM.**

## Exercise 11.3-1

**Available in the latest revision of the IM.**

## Exercise 11.3-2

**Available in the latest revision of the IM.**

{% hint style="danger" %}
The solution in the IM doesn't take into account the fact that the key is encoded as a radix-128 number. So, the correct way to calculate the hash value is to apply Horner's rule and compute it iteratively. Let's assume that the key $$k$$ is represented as a sequence of characters $$\langle s\_1, s\_2, \dots, s\_r \rangle$$ in big-endian format.

```
// Snippet of pseudocode to calculate h(k)
h = 0
for i = 1 to r:
    // Multiplying by 128 is shifting left by 7 positions.
    h = ((h << 7) + s[i]) mod m
```

{% endhint %}

## Exercise 11.3-3

**Available in the latest revision of the IM.**

{% hint style="warning" %}
The solution in the IM is incomplete, since it doesn't answer the last question from this exercise.  A cryptographic hash function must ensure that any permutation of characters generate a different hash value. Otherwise, someone could rearrange a document without breaking a message digest used in a digital signature. Imagine an adversary being able to rearrange the bank account number and/or amount of a digitally signed transaction.
{% endhint %}

## Exercise 11.3-4

61  → 700\
62 → 318\
63 → 936\
64 → 554\
65 → 172

## Exercise 11.3-5

**Available in the latest revision of the IM.**

## Exercise 11.3-6

The size of our family of hash functions is $$|\Z\_p|=p$$, as we have $$p$$ choices to select $$h\_b$$. We need to calculate the probability of a collision $$h\_b(k\_1)=h\_b(k\_2)$$ for distinct keys $$k\_1$$ and $$k\_2$$. This probability is over the picks of $$h\_b$$. Observe that

$$
0 \le h\_b(k\_1),h\_b(k\_2)\<p \implies -p < h\_b(k\_1)-h\_b(k\_2)\<p
$$

which gives

$$
h\_b(k\_1)-h\_b(k\_2)=0\space(!!!!!!\mod p) \implies h\_b(k\_1)=h\_b(k\_2)
$$

We have the following equalities all taken modulo $$p$$

$$
h\_b(k\_1)-h\_b(k\_2)=\sum\_{j=0}^{d-1} k\_{1j}b^j-\sum\_{j=0}^{d-1} k\_{2j}b^j=\sum\_{j=0}^{d-1} (k\_{1j}-k\_{2j})b^j
$$

Following the hint from the book, according to Exercise 31.4-4 the above polynomial can have at most $$d-1$$ distinct zeros modulo $$p$$. In other words, at most $$d-1$$ choices of $$b$$ can lead to a collision. Therefore, our family of hash functions is $$\epsilon$$-universal for $$\epsilon = (d-1)/p$$.

## Exercise 11.4-1

Use [VisuAlgo](https://visualgo.net/en/hashtable) and select linear probing in the top menu bar. Choose the option in the left sidebar to create an empty hash table with $$m=11$$ slots and $$n=0$$ elements and press *Go*. Afterward, select the command to insert elements `10,22,31,4,15,28,17,88,59` and press *Go*.

For double hashing, the above website uses a different auxiliary hash function $$h\_2$$, so we cannot use it here. The hash table is filled as follows:

<table data-full-width="false"><thead><tr><th width="60.842987060546875" data-type="number">0</th><th width="59.1715087890625" data-type="number">1</th><th width="58.96295166015625" data-type="number">2</th><th width="60.88702392578125" data-type="number">3</th><th width="60.24249267578125" data-type="number">4</th><th width="61.2135009765625" data-type="number">5</th><th width="62.30450439453125" data-type="number">6</th><th width="62.05303955078125" data-type="number">7</th><th width="61.8988037109375" data-type="number">8</th><th width="60.26361083984375" data-type="number">9</th><th width="65.8997802734375" data-type="number">10</th></tr></thead><tbody><tr><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>10</td></tr><tr><td>22</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>10</td></tr><tr><td>22</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>31</td><td>10</td></tr><tr><td>22</td><td>null</td><td>null</td><td>null</td><td>4</td><td>null</td><td>null</td><td>null</td><td>null</td><td>31</td><td>10</td></tr><tr><td>22</td><td>null</td><td>null</td><td>null</td><td>4</td><td>15</td><td>null</td><td>null</td><td>null</td><td>31</td><td>10</td></tr><tr><td>22</td><td>null</td><td>null</td><td>null</td><td>4</td><td>15</td><td>28</td><td>null</td><td>null</td><td>31</td><td>10</td></tr><tr><td>22</td><td>null</td><td>null</td><td>17</td><td>4</td><td>15</td><td>28</td><td>null</td><td>null</td><td>31</td><td>10</td></tr><tr><td>22</td><td>null</td><td>null</td><td>17</td><td>4</td><td>15</td><td>28</td><td>88</td><td>null</td><td>31</td><td>10</td></tr><tr><td>22</td><td>null</td><td>59</td><td>17</td><td>4</td><td>15</td><td>28</td><td>88</td><td>null</td><td>31</td><td>10</td></tr></tbody></table>

## Exercise 11.4-2

`Hash-Delete` expects the key to be present in the hash table. Otherwise, there should be another check against `nil` for the returned index.

```
Hash-Delete(T, k)
    i = Hash-Search(T, k)
    T[i] = DELETED
```

The `Hash-Insert` procedure from the book requires a slight change in 4 line as follows:

```
if T[q] == nil or T[q] == DELETED
```

The `Hash-Search` doesn't need any change.

## Exercise 11.4-3

**Available in the latest revision of the IM.**

## Exercise 11.4-4

Instead of doing integral approximation (see page 300) we simply use the definition of the $$m$$th harmonic humber.

$$
\frac{1}{\alpha}\sum\_{k=m-n+1}^m \frac{1}{k}=\sum\_{k=1}^m \frac{1}{k}=H\_m
$$

## Exercise 11.4-5

This directly follows from Theorem 31.20, since $$|\langle h\_2(k)\rangle|=m/d$$.

## Exercise 11.4-6

Enter `solve 1/(1-x)=2/x ln(1/(1-x))` in [WolframAlpha](https://www.wolframalpha.com/) to get $$\alpha \approx 0.71533$$ by clicking on the *Approximate form* button inside the *Result* pane.

## Exercise 11.5-1

$$f\_a^{(0)}(k)$$ is an identity function, which is obviously injective. If we prove that $$f\_a=f\_a^{(1)}(k)$$ is injective (base case), then $$f\_a^{(r)}(k)=f\_a \circ f\_a^{(r-1)}(k)$$ for $$r>1$$ is also injective by induction on $$r$$. Note that the composition of two bijections is again a bijection (see solution for [Exercise B.3-4](https://evarga.gitbook.io/sh-intro-to-algs/appendices/appendix-b#exercise-b.3-4)).

We know that the $$swap$$ function only rearranges the bits of an input independently of it's value, therefore $$x=y \iff swap(x)=swap(y)$$.

Suppose, for the sake of contradiction, that there are two distinct inputs $$x$$ and $$y$$ such that $$f\_a(x) =f\_a(y)$$. WLOG assume that $$x>y$$, otherwise, exchange them. Thus, we have

$$
\begin{align\*}
&2x^2+ax = 2y^2+ay\space(!!!!!!\mod 2^{\omega}) \\
&\implies 2^{\omega} | (2x^2+ax) - (2y^2+ay) \\
&\implies 2^{\omega} | 2(x-y)(x+y)+a(x-y) \\
&\implies 2^{\omega} | (x-y)(2(x+y)+a) && \text{(contradiction!)}
\end{align\*}
$$

We have reached a contradiction, since $$a$$ is odd and $$2^{\omega} ∤ (x-y)$$. Recall that both $$x$$ and $$y$$ are $$\omega$$-bit words, hence $$x-y < 2^{\omega}$$. Consequently, $$f\_a$$ is injective.

## Exercise 11.5-2

A random oracle is equivalent to an independent uniform hash function family. Let us take 5 distinct keys $$k\_1,k\_2,k\_3,k\_4,k\_5$$. Each will be independently and uniformly mapped by a random oracle to a slot in the range $$\[0,m)$$. There are $$m^5$$ possible mappings, each with an equal probability of occurrence $$1/m^5$$. Thus, a random oracle is 5-independent according to the definition from the book on page 288.

Since a random oracle guarantees independence for any number of distinct inputs, it inherently satisfies 5-independence. 5-independence is a weaker condition than full independence; a fully independent function trivially satisfies $$k$$-independence for all $$k$$.

## Exercise 11.5-3

At least $$r=3$$ rounds are required to trigger an avalanche effect potentially affecting any bit of the output to flip. In the illustration below, light red colored cells could be affected by flips in previous rounds. After 3 rounds, all cells are marked red, hence may be impacted by flipping a single bit of the input value $$k$$. Observe that flipping the most-significant bit of $$k$$ only impacts the most-significant bit of $$g\_a(k)$$ in the first round. Any other bit of $$k$$ can only make the situation better. Hence, we consider a flip in bit position $$\omega-1$$ as the worst-case scenario. Note that round 0 is an identity function, so it is essentially the key itself. Each round consists of two parts: the upper is $$g\_a(k)$$ and the lower is $$f\_a(k)$$ of the corresponding round.

<img src="https://1997161-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FlBSM5MLFKTKipGIIw1fs%2Fuploads%2FM96kPpduH3d3vwhFtaFq%2Ffile.excalidraw.svg?alt=media&#x26;token=0b95dccd-90bb-4f7d-84d3-bbd39461825b" alt="" class="gitbook-drawing">

## Problem 11-1

**Available in the latest revision of the IM.**

## Problem 11-2

**Available in the latest revision of the IM.**

## Problem 11-3

**Available in the latest revision of the IM.**

## Problem 11-4

### a.

Arbitrarily pick two distinct keys $$x,y \in U$$. If $$\mathcal{H}$$ is 2-independent, then we can have $$m^2$$ different pairs $$(h(x),h(y))$$ of hash values with equal probability of occurrence $$1/m^2$$. Collisions are denoted by pairs $$(i,i)$$ for $$i=0,1,\dots,m-1$$.  Therefore, the probability of a collision for two distinct keys is $$m/m^2=1/m$$. Consequently, $$\mathcal{H}$$ is universal.

### b.

First, we show that is $$\mathcal{H}$$ universal. Arbitrarily pick two distinct keys $$x,y \in U$$. They must differ in at least one component. WLOG let it happen at index $$0 \le i < n$$. For a collision to occur, with a randomly picked hash function $$h\_a \in \mathcal{H}$$, we must have $$h\_a(x)=h\_a(y)$$. Using the definition of $$h\_a$$ we get

$$
\begin{align\*}
&\sum\_{j=0}^{n-1} a\_jx\_j = \sum\_{j=0}^{n-1} a\_jy\_j \space(!!!!!!\mod p) \\
&\implies \sum\_{j=0}^{n-1} a\_j(x\_j-y\_j)=0 \space(!!!!!!\mod p) \\
&\implies a\_i(x\_i-y\_i)=-\sum\_{\substack{0\le j\<n\i \neq j}} a\_j(x\_j-y\_j) \space(!!!!!!\mod p) \\
\end{align\*}
$$

Observe that $$-(p-1) \le x\_i-y\_i \le p-1$$ and $$x\_i-y\_i \neq 0$$ implies that $$x\_i-y\_i$$ has a multiplicative inverse modulo $$p$$. Therefore,

$$
\begin{equation}
a\_i=-(x\_i-y\_i)^{-1}\sum\_{\substack{0\le j\<n\i \neq j}} a\_j(x\_j-y\_j)
\end{equation}
$$

Notice that $$a\_i$$ is fully determined based on the choice of $$x$$, $$y$$ and other components of $$a$$. Thus, the probability of a collision is $$1/p$$, since for a randomly selected $$n$$-tuple $$a$$, there are $$p$$ possible values at index $$i$$ and only one leads to a collision. Furthermore, if $$h\_a(x)=h\_a(y)$$ then all coefficients $$a\_k$$, associated with corresponding nonzero terms $$x\_k-y\_k$$, simultaneously satisfy (1) (just set $$i=k)$$. This shows that $$\mathcal{H}$$ is universal.

Following the hint from the book, it is easy to see that $$x\_0=\langle0,0,\dots,0\rangle$$ always hashes to zero. Therefore, the condition to have $$p^2$$ unique pairs of hash values, with equal probability of occurrence, for any two distinct keys is not satisfied. Consequently, $$\mathcal{H}$$ is not 2-independent.

### c.

Taking the hint from the book, let's see what happens to $$h\_{ab}^{'}(x)$$ as $$a\_i$$ and $$b$$ range over $$\Z\_p$$. The same reasoning will also apply to $$h\_{ab}^{'}(y)$$. Let  $$c=\sum\_{\substack{0\le j\<n\i \neq j}} a\_jx\_j$$, since we keep all other parameters fixed. This constant $$c$$ has no influence on the subsequent analysis.&#x20;

We must show that $$h\_{ab}^{'}(x)$$ can take any value from $$\Z\_p$$ and equal number of distinct pairs $$(a\_i,b)$$ map to each such output. There are two cases to consider:

* If $$x\_i=0$$ then each value of $$b$$ produces a unique hash output, so it may range over $$\Z\_p$$. For every $$b$$ we can freely assign $$p$$ possible values to $$a\_i$$. In other words, $$p$$ distinct pairs map to each distinct hash value.
* Otherwise, according to theorem 31.20, we know that $$|\langle x\_i \rangle|=p$$, since $$x\_i$$ is relatively prime to $$p$$. In this case, the parameter $$b$$ only shifts the full sequence of $$p$$ values generated by $$x\_i$$. There are $$p$$ such shifts. Thus, the output may range over $$\Z\_p$$ and again $$p$$ distinct pairs $$(a\_i,b)$$ map to each distinct hash value.

We can conclude that $$\mathcal{H}^{'}$$ is 2-independent.

### d.

Since $$\mathcal{H}$$ is 2-independent, we can have $$p^2$$ different pairs of hash values, with equal probability of occurrence $$1/p^2$$, for two distinct keys $$x,y \in U$$. The adversary may fool Bob, assuming that $$m \neq m'$$, if $$h'(m')=h(m')=t'$$, where $$h$$ denotes the shared hash function between Alice and Bob. All the adversary can do is to find a hash function $$h'$$ such that $$h'(m)=h(m)=t$$, hoping that it will also collide on $$m'$$ with $$h$$. There are $$p$$ pairs of possible hash values whose first component is $$t$$. The probability that $$h'$$ will also generate the same second component as $$h$$ for $$m'$$ is $$1/p$$. Therefore, no matter how much computing power the adversary has, her/his probability of success is at most $$1/p$$.

As a side note, the previous analysis may also be carried out in an opposite direction. Namely, to ask what is the probability that Alice and Bob had selected a random hash function that hashes to the same value as $$h'$$ for $$m'$$. Among those that produce $$t$$ for $$m$$ only $$(1/p)$$th generate $$t'$$ for $$m'$$. So, we arrive at the same conclusion.
