Balanced trees provide a wide variety of low-cost operations on ordered sets and dynamic dictionaries. Of the many types of balanced trees that have been developed over the years, randomized search trees (treaps)[4] have the advantage of being both simple and general--in addition to insertion and deletion they easily support efficient finger searching, joining, and splitting. Furthermore, by using appropriate hash functions they require no balance information to be stored at the nodes. Treaps, however, have only been studied in the context of sequential algorithms, and there has been little study of their performance.

Our interest is in developing fast parallel algorithms for
aggregate set operations * intersection*, * union*, and
* difference*. These operations play an important role in
databases queries and index searching--each term (word) can be
represented as a set of the ``documents'' it appears in and searches
on logical conjunctions of terms are implemented as set operations
(intersection for ** and**, union for ** or
**, and difference for ** and-not**)[5,3]. Searching,
inserting, and deleting multiple elements from a dictionary in parallel
have similar algorithms.

Treaps are randomized
search trees in which each node in the tree has a key and an
associated random priority. The nodes are organized so that the keys
appear in in-order and the priorities appear in heap-order. Our set
operation algorithms based on treaps run in optimal *O(m* log
*(n/m))* work and in *O(*log *n)* depth
(parallel time) on an EREW PRAM with unit-time scan operations (used
for load balancing)--all expected case. This bound is based on
automated pipelining)[1]. Without
pipelining the algorithms run in *O(*log^{2}*
n)* depth. Previous work has focused either on parallel merging
algorithms that take *O(n+m)* work and are optimal when the
two sets have nearly equal sizes or on multi-insertion algorithms
that take *O(m* log *n)* work and are optimal when the
input values to be inserted are not presorted.

To analyze the effectiveness of the algorithms in practice we ran several experiments. We were interested in various properties including how well treaps compare sequentially with other balanced trees such as red-black trees, splay trees, and skip lists, how well our algorithms perform for various overlaps in the keys, and how well the algorithms parallelize. The parallel implementation and memory management was coded in Cilk, a parallel extension of C, and run on a Sun Ultra Enterprise and a SGI Power Challenge. The results of this work is summarized in [2].

**1**-
G. Blelloch and M. Reid-Miller
Pipelining with Futures.
In
*Proceedings of the 9th Annual ACM Symposium on Parallel Algorithms and Architectures*, pages 249-259, Newport, RI, June 1997 **2**-
G. Blelloch and M. Reid-Miller
Fast Set Operations Using Treaps.
In
*Proceedings of the 10th Annual ACM Symposium on Parallel Algorithms and Architectures*, pages 16-26, Puerto Vallarta, Mexico, June 1998. **3**-
M.L. Mauldin.
Lycos: Design choices in an Internet search service
*IEEE Expert*, 12(1):8-11, 1997. **4**-
R. Seidel and C.R. Aragon.
Randomized Search Trees.
*Algorithmica*, 16:464-497, 1996. **5**-
I.H. Witten, A. Moffat, and T.C. Bell.
*Managing Gigabytes: Compressing and Indexing Documents and Images*. Van Nostrand Reinhold.

Up to the Irregular Algorithms page or the Scandal page.

mrmiller@cs.cmu.edu. Last modified: Tue Jul 14 15:23:01 EDT 1998