This paper addresses the problem of subgroup discovery, which can be
defined as follows: given a population of individuals and a property of those
individuals we are interested in, find population subgroups that are
statistically `most interesting', e.g., are as large as possible and
have the most unusual statistical (distributional) characteristics
with respect to the property of interest
(Klösgen, 1996; Wrobel, 1997, 2001). The main contribution of the paper is a
new methodology supporting the process of expert-guided subgroup
discovery. Specifically, we introduce a novel parametrized definition
of rule quality used in a heuristic beam search algorithm, a rule
subset selection algorithm incorporating example weights, the
detection of statistically significant properties of selected
subgroups, and a novel subgroup visualization method. An in-depth
analysis of the proposed quality measure is provided as well. The
proposed methodology has been applied to the medical problem of
detecting and describing patient groups with high risk for
atherosclerotic coronary heart disease (CHD).^{1}

The paper is organized as follows. Section 2 describes the algorithms for subgroup detection and selection, which are the main ingredients of the expert-guided subgroup discovery methodology. Section 3 presents the coronary heart disease risk group detection problem, the discovered patient risk groups, and their statistical characterization, visualization, medical interpretation, and evaluation, including a discussion of the expert's role in the subgroup discovery process. Section 4 provides an in-depth analysis of the proposed rule quality measure for subgroup discovery, including an experimental comparison with a selected cost-based quality measure. Finally, Section 5 discusses related work.