3. When the step consists of an equation expression, the previous step is
repeated, but now for the value instead of the name.
4 Experimental Data
The two prototypes give us the opportunity to experiment with the parameters
used in the protocol and, more importantly, compare the linear search approach
with the tree search approach. We are especially interested in the influence the
approach and the parameters n and m have on the encryption and search speed.
We used the XML benchmark¹ [5] to generate three sample XML files of sizes
1 MB, 10 MB and 100 MB. Although the linear approach does not use the
structure of these XML files the benchmark is used in both cases to compare the
results with the tree search approach.
The number of collisions has also been measured (see figure 2(a)). Collisions
are the false hits that occur because of collisions in the hash function F. F
hashes the random value S_i of size n − m to a hash value of length m, where
n − m ≥ m. Collisions are therefore unavoidable (they can be avoided only when
n − m = m and F is bijective, but bijective functions are not good hash functions).
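To get a feeling for how m drives the number of false hits, the following minimal sketch estimates the expected count under a simplifying assumption that is not taken from the measurements: F behaves like a uniformly random function, so each non-matching block passes the m-byte check with probability 2^(−8m). The function name `expected_false_hits` and the convention 1 MB = 2^20 bytes are illustrative choices, and the result is a model estimate, not the data of figure 2(a).

```python
def expected_false_hits(data_bytes: int, n: int, m: int) -> float:
    """Expected number of false hits for one query over data_bytes of text.

    Assumption: every block of n bytes passes the m-byte hash check
    independently with probability 2^(-8*m) (F modelled as random).
    """
    blocks = data_bytes // n          # number of n-byte blocks scanned
    return blocks * 2.0 ** (-8 * m)   # m is given in bytes, hence 8*m bits


# Example: 100 MB of data, n = 8 bytes, m = 1 byte of hash output.
print(expected_false_hits(100 * 2**20, 8, 1))
```

Under this model each additional byte of m reduces the expected number of false hits by a factor of 256, while the number of blocks grows linearly with the data size.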
4.1 Experiments with the Linear Search Prototype
For the linear prototype both n and m may be chosen freely. Tests were carried
out for all n ∈ {8, 16, 24, 32, 40, 48, 56, 64}, where these values are numbers
of bytes, not bits. Because we use DES in ECB mode for the encryption function
E, we only use multiples of 8 bytes. Since n − m ≥ m, m should be less than or
equal to n/2, so m ∈ {1, 2, . . . , n/2} (also in bytes). Measurement results of
the 100 MB case are plotted in figure 2(b). Tests with data inputs of 1 MB and
10 MB showed that the number of collisions and the search and encryption times
are proportional to the data size. In our technical report [6] more experimental
data is provided.
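The parameter space described above can be enumerated as follows. This is only a sketch of the (n, m) combinations implied by the two sets, assuming every pair is tested; the variable names `block_sizes` and `grid` are illustrative.

```python
# Block sizes n in bytes: multiples of 8 (the DES block size) up to 64.
block_sizes = range(8, 65, 8)

# For each n, the hash length m runs over 1, 2, ..., n/2 bytes.
grid = [(n, m) for n in block_sizes for m in range(1, n // 2 + 1)]

print(len(grid))   # number of (n, m) configurations
print(grid[:4])    # the configurations for n = 8
```

Under this assumption there are 4 + 8 + ... + 32 = 144 configurations per data size.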
All tests were carried out on a Pentium IV 2.4 GHz with 512 MB memory.
For the search query a word guaranteed to be in at least one location was
chosen. The search engine does not stop when one occurrence is found; all the
text is scanned for each query.
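The exhaustive scan can be sketched as below. This is a simplified model, not the prototype's code: each candidate block is assumed to be already split into a random part of n − m bytes followed by an m-byte check value, and F is replaced by a truncated HMAC-SHA256 as a stand-in. The names `F` and `scan_all` and the way the key is passed are hypothetical.

```python
import hashlib
import hmac


def F(key: bytes, s: bytes, m: int) -> bytes:
    """Stand-in for the hash F: HMAC-SHA256 truncated to m bytes (assumption)."""
    return hmac.new(key, s, hashlib.sha256).digest()[:m]


def scan_all(blocks, key: bytes, m: int):
    """Return the index of every block whose last m bytes equal F(first n-m bytes)."""
    hits = []
    for i, block in enumerate(blocks):
        s, t = block[:-m], block[-m:]
        if hmac.compare_digest(F(key, s, m), t):
            hits.append(i)  # record the hit and keep scanning
    return hits
```

Because the loop never exits early, the search time in this model depends only on the number of blocks and not on how many occurrences, or false hits, the query has.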
4.2 Experiments with the Tree Search Prototype
For the tree search prototype the only configurable parameters are m and the
data size. The block length n depends on the tag names and values. Encryption
tests were carried out on the same XML documents as for the linear prototype. In
this case m is specified relative to n: m/n ∈ {0.1, 0.2, 0.3, 0.4, 0.5}. The
encryption times for the 1 MB, 10 MB and 100 MB files were 21.5, 188 and 1195 s
respectively and did not depend on m.
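As a small worked example, the reported encryption times can be converted into throughput. The sketch below uses only the three figures quoted above; the variable names are illustrative.

```python
sizes_mb = [1, 10, 100]        # input sizes in MB
times_s = [21.5, 188, 1195]    # reported encryption times in seconds

# Throughput in MB/s for each file.
throughput = [size / t for size, t in zip(sizes_mb, times_s)]

for size, rate in zip(sizes_mb, throughput):
    print(f"{size:>3} MB: {rate:.3f} MB/s")
```

This gives roughly 0.047, 0.053 and 0.084 MB/s, so for the tree prototype a 100-fold increase in data size corresponds to about a 56-fold increase in encryption time.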
Search tests were carried out with a fixed m = 0.5n because m does not seem
to have much influence. Some queries are shown in table 1. Also the number of
¹ http://www.xml-benchmark.org