(i.e., the level is the number of trailing zeroes in the rolling checksum in excess of the threshold needed to produce the prefix chunk $P(X)$).

(Note:
When $|R(X)| > 0$,
$L(X)$ is non-negative,
because $P(X)$ is defined in terms of a hash with at least $T$ trailing zeroes.
But when $|R(X)| = 0$,
that hash may have fewer than $T$ trailing zeroes,
and so $L(X)$ may be negative.
This makes no difference to the algorithm below, however,
because a negative level closes no nodes.)
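
As an illustration of the level computation described above, here is a minimal Python sketch. The function names `trailing_zeroes` and `level`, and the assumed 32-bit checksum width, are hypothetical and not taken from the text; the checksum value `h` is assumed to be the one computed at the end of the prefix chunk $P(X)$.

```python
HASH_BITS = 32  # assumed checksum width; the text does not fix this value here


def trailing_zeroes(h: int) -> int:
    """Count the trailing zero bits of a non-negative checksum value."""
    if h == 0:
        # Assumption: an all-zero checksum counts as HASH_BITS trailing zeroes.
        return HASH_BITS
    # h & -h isolates the lowest set bit; its position is the zero count.
    return (h & -h).bit_length() - 1


def level(h: int, threshold: int) -> int:
    """Trailing zeroes in excess of the threshold T; may be negative."""
    return trailing_zeroes(h) - threshold
```

With a threshold of 2, a checksum ending in three zero bits gives level 1, while a checksum ending in a one bit (possible only for the final chunk) gives level -2.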

A “node” in a hashsplit tree
is a pair $(D, C)$
where $D$ is the node’s “depth”
and $C$ is a sequence of children.
The children of a node at depth 0 are chunks
(i.e., subsequences of the input).
The children of a node at depth $D > 0$ are nodes at depth $D - 1$.

The function $\operatorname{Children}(N)$ on a node $N = (D, C)$ produces $C$
(the sequence of children).
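
One possible way to represent such a node in Python is sketched below. The `Node` class and the `children` helper are illustrative names only, and chunks are assumed to be `bytes` values.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """A pair (D, C): a depth and a sequence of children."""
    depth: int
    # At depth 0 the children are chunks; above that they are Nodes
    # whose depth is one less than this node's depth.
    children: List[Union[bytes, "Node"]] = field(default_factory=list)


def children(n: Node) -> List[Union[bytes, "Node"]]:
    """Counterpart of Children(N): return the node's sequence of children."""
    return n.children


leaf = Node(0, [b"ab", b"cd"])   # depth-0 node holding two chunks
parent = Node(1, [leaf])         # depth-1 node holding one depth-0 node
```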

## Algorithm

To compute a hashsplit tree from sequence $X$,
compute its “root node” as follows.

1. Let $N_0$ be $(0, \langle\rangle)$ (i.e., a node at depth 0 with no children).
2. If $|X| = 0$, then:
    a. Let $d$ be the largest depth such that $N_d$ exists.
    b. If $|\operatorname{Children}(N_0)| > 0$, then:
        i. For each integer $i$ in $[0 .. d]$, “close” $N_i$.
        ii. Set $d \leftarrow d+1$.
    c. [pruning] While $d > 0$ and $|\operatorname{Children}(N_d)| = 1$, set $d \leftarrow d-1$ (i.e., traverse from the prospective tree root downward until there is a node with more than one child).
    d. **Terminate** with $N_d$ as the root node.
3. Otherwise, set $N_0 \leftarrow (0, \operatorname{Children}(N_0) \mathbin{\|} \langle P(X) \rangle)$ (i.e., add $P(X)$ to the list of children in $N_0$).
4. For each integer $i$ in $[0 .. L(X))$, “close” the node $N_i$ (see below).
5. Set $X \leftarrow R(X)$.
6. Go to step 2.

To “close” a node $N_i$:

1. If no $N_{i+1}$ exists yet, let $N_{i+1}$ be $(i+1, \langle\rangle)$ (i.e., a node at depth ${i + 1}$ with no children).
2. Set $N_{i+1} \leftarrow (i+1, \operatorname{Children}(N_{i+1}) \mathbin{\|} \langle N_i \rangle)$ (i.e., add $N_i$ as a child to $N_{i+1}$).
3. Set $N_i \leftarrow (i, \langle\rangle)$ (i.e., a new node at depth $i$ with no children).
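
The tree-building and “close” steps above can be sketched in Python as follows. This is an interpretation under stated assumptions, not a reference implementation. The input is assumed to be an already-split sequence of `(chunk, level)` pairs standing in for repeated application of $P$, $L$, and $R$. The names `Node` and `build_root` are hypothetical. Two readings of the termination step are assumptions about intent rather than things the text states: every non-empty node below the top is closed (so that children moved up by an earlier close are not left unattached), and pruning descends into the single child of the prospective root.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple, Union


@dataclass
class Node:
    depth: int
    children: List[Union[bytes, "Node"]] = field(default_factory=list)


def build_root(chunks: Iterable[Tuple[bytes, int]]) -> Node:
    """Build a hashsplit tree from (chunk, level) pairs and return its root."""
    nodes: List[Node] = [Node(0)]  # nodes[i] plays the role of N_i

    def close(i: int) -> None:
        if i + 1 == len(nodes):                  # no N_{i+1} exists yet
            nodes.append(Node(i + 1))
        nodes[i + 1].children.append(nodes[i])   # add N_i as a child of N_{i+1}
        nodes[i] = Node(i)                       # start a fresh N_i

    for chunk, level in chunks:
        nodes[0].children.append(chunk)          # step 3: add P(X) to N_0
        for i in range(level):                   # step 4: empty if level <= 0
            close(i)

    # Termination (assumed reading): attach every non-empty node below the top.
    d = len(nodes) - 1
    for i in range(d):
        if nodes[i].children:
            close(i)

    # Pruning (assumed reading): descend while the root has a single child.
    root = nodes[d]
    while root.depth > 0 and len(root.children) == 1:
        root = root.children[0]
    return root
```

For example, `build_root([(b"a", 0), (b"b", 1), (b"c", 0)])` yields a depth-1 root whose two depth-0 children hold the chunks `a`, `b` and `c` respectively, while a single-chunk input prunes down to one depth-0 node.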