Our strategy for initial clustering is sorting, a term adapted from a psychological task that requires subjects to perform roughly the procedure described here [Ahn & Medin, 1989]. Given an observation and a current partition, sorting evaluates the quality of each clustering that results from placing the observation in one of the existing clusters, as well as the quality of the clustering that results from creating a new cluster covering only the new observation; the option that yields the highest quality score (e.g., under PU) is selected. The clustering grows incrementally as new observations are added.
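The text gives no pseudocode for sorting; the following Python sketch illustrates the flat (one-level) case. The function `toy_quality` is a hypothetical stand-in for the PU measure discussed in the paper: it simply rewards tight clusters and penalizes a proliferation of clusters.

```python
from typing import Callable, List

def toy_quality(clusters: List[List[float]]) -> float:
    """Hypothetical stand-in for partition utility (PU): reward tight
    clusters, penalize a proliferation of clusters."""
    spread = sum((x - sum(c) / len(c)) ** 2 for c in clusters for x in c)
    return -spread - 0.5 * len(clusters)

def sort_observation(obs: float,
                     partition: List[List[float]],
                     quality: Callable[[List[List[float]]], float] = toy_quality
                     ) -> List[List[float]]:
    """Return the best partition obtained by adding `obs` either to one
    of the existing clusters or to a brand-new singleton cluster."""
    candidates = []
    # Option 1: place the observation in each existing cluster in turn.
    for i in range(len(partition)):
        trial = [list(c) for c in partition]
        trial[i].append(obs)
        candidates.append(trial)
    # Option 2: create a new cluster that covers only the observation.
    candidates.append([list(c) for c in partition] + [[obs]])
    # Keep whichever candidate clustering scores highest.
    return max(candidates, key=quality)
```

With the toy score, an observation near an existing cluster joins it, while an outlying observation is given its own singleton cluster.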
This procedure is easily incorporated into a recursive loop that builds tree-structured clusterings: given an existing hierarchical clustering, an observation is sorted relative to the top-level partition (i.e., the children of the root); if an existing child of the root is chosen to include the observation, then the observation is sorted relative to the children of that node, which serves as the root in the recursive call. If a leaf is reached, the tree is extended downward. The maximum height of the tree can be bounded, thus limiting downward growth to a fixed depth. Figure 2 shows the tree of Figure 1 after two new observations have been added: one observation extends the left subtree downward, while the second becomes a new leaf at the deepest existing level of the right subtree.
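The recursive loop can be sketched as follows; this is an illustrative Python rendering, not the paper's implementation, and `toy_quality` is again a hypothetical stand-in for PU. A leaf that receives a second observation is extended downward into singleton children, unless the height bound `max_depth` forbids it.

```python
from typing import List, Optional

def toy_quality(clusters: List[List[float]]) -> float:
    """Hypothetical stand-in for partition utility (PU)."""
    spread = sum((x - sum(c) / len(c)) ** 2 for c in clusters for x in c)
    return -spread - 0.5 * len(clusters)

class Node:
    """A cluster in the categorization tree; a node with no children
    is a leaf holding the observations sorted into it."""
    def __init__(self, observations: Optional[List[float]] = None):
        self.observations = list(observations or [])
        self.children: List["Node"] = []

def hsort(node: Node, obs: float, quality=toy_quality,
          depth: int = 0, max_depth: int = 3) -> None:
    """Recursively sort `obs` down the tree rooted at `node`,
    bounding downward growth at `max_depth`."""
    node.observations.append(obs)
    if not node.children:
        # A leaf was reached: extend the tree downward by splitting
        # the leaf's observations into singleton children, unless the
        # height bound forbids further growth.
        if depth < max_depth and len(node.observations) > 1:
            node.children = [Node([o]) for o in node.observations]
        return
    # Score each option: place obs in an existing child, or in a new
    # singleton child (represented here by None).
    def score(option):
        clusters = [list(c.observations) for c in node.children]
        if option is None:
            clusters.append([obs])
        else:
            clusters[node.children.index(option)].append(obs)
        return quality(clusters)
    best = max(node.children + [None], key=score)
    if best is None:
        node.children.append(Node([obs]))
    else:
        hsort(best, obs, quality, depth + 1, max_depth)

# Incrementally sort a small stream of observations into a tree.
root = Node()
for x in [1.0, 1.1, 5.0, 5.2]:
    hsort(root, x)
```

After the loop, the two nearby observations 5.0 and 5.2 share an internal node whose children store them as singleton leaves, mirroring the downward extension described above.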
Figure 2: An updated probabilistic categorization tree.
This sorting strategy is identical to that used by Anderson and Matessa [Anderson & Matessa, 1991]: the children of each cluster partition the observations covered by their parent, and the observations themselves are stored as singleton clusters at the leaves of the tree. The measure used to guide sorting, PU, however, differs from that of Anderson and Matessa. Other hierarchical-sort-based strategies augment this basic procedure in a manner described in Section 3.3 [Fisher, 1987a; Hadzikadic & Yun, 1989; Decaestecker, 1991].