Integrating a Knowledge-based WSD system into a Corpus-based WSD system

In this case, we used only the domain heuristic to improve ME, because domain information can be added directly as features. The data sparseness problem that affects any feature-based WSD system may be aggravated by the fine-grained sense distinctions of WordNet. Domain information, by contrast, significantly reduces word polysemy (i.e., a word generally has fewer domain categories than senses), and the results obtained by this heuristic have better precision than those obtained by the whole SM method, which in turn obtains better recall.

As shown in subsection 3.1.7, the domain heuristic can annotate word senses with their domains. These domains are therefore used as an additional type of feature for ME, taken from a context window of $\pm1$, $\pm2$, and $\pm3$ positions around the target word. In addition, the three most relevant domains were calculated for each context and added to the training data as features.
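The feature scheme described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation used in the experiments: the function name `domain_features`, the feature-string format, and the use of raw frequency as the measure of domain "relevance" are all hypothetical (the actual relevance computation is the one defined for the domain heuristic in subsection 3.1.7).

```python
from collections import Counter

def domain_features(domains, target, window=3, top_k=3):
    """Build binary domain features for one target word.

    domains: one domain label (or None) per token of the context
    target:  index of the target word in that list
    Assumption: "relevance" is approximated by label frequency in the context.
    """
    feats = {}
    # Positional features: domain label found at offsets -window..+window
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        i = target + offset
        if 0 <= i < len(domains) and domains[i] is not None:
            feats[f"dom[{offset:+d}]={domains[i]}"] = 1
    # Context-level features: the top_k most frequent domains in the context
    counts = Counter(d for d in domains if d is not None)
    for label, _ in counts.most_common(top_k):
        feats[f"ctx_dom={label}"] = 1
    return feats

# Hypothetical context: the target word is at index 3 and has no label yet
example = ["MUSIC", None, "MUSIC", None, None, "SPORT", "LAW", "ECONOMY"]
features = domain_features(example, target=3)
```

Each returned key is a binary feature that could be appended to the existing feature vector of the ME classifier.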

This experiment was also carried out on the English lexical-sample task data from SENSEVAL-2, and ME was used to generate two groups of classifiers from the training data.

The first group of classifiers was trained on the corpus without domain information. The second was trained on the corpus after domain disambiguation with SM, incorporating the domain labels of adjacent nouns and the three most relevant domains of the context; that is, the classifier was given a richer feature set that includes the domain features. In this case, however, we did not perform any feature selection.

The test data was disambiguated by ME twice, with and without SM domain labelling, using $0lWsbcpdm$ (see Figure 9) as the common set of features in order to perform the comparison. The results of the experiment are shown in Table 16.

Table 16: Precision Results Using ME to Disambiguate Words, With and Without Domains (recall and precision values are equal)

Target word   Without domains   With domains   Improvement
art           0.667             0.778           0.111
authority     0.600             0.700           0.100
bar           0.625             0.615          -0.010
bum           0.865             0.919           0.054
chair         0.898             0.898           0.000
channel       0.567             0.597           0.030
child         0.661             0.695           0.034
church        0.560             0.600           0.040
circuit       0.408             0.388          -0.020
day           0.676             0.669          -0.007
detention     0.909             0.909           0.000
dyke          0.800             0.800           0.000
facility      0.429             0.500           0.071
fatigue       0.850             0.850           0.000
feeling       0.708             0.688          -0.021
grip          0.540             0.620           0.080
hearth        0.759             0.793           0.034
holiday       1.000             0.957          -0.043
lady          0.900             0.900           0.000
material      0.534             0.552           0.017
mouth         0.569             0.588           0.020
nation        0.720             0.720           0.000
nature        0.459             0.459           0.000
post          0.463             0.512           0.049
restraint     0.516             0.452          -0.065
sense         0.676             0.622          -0.054
spade         0.765             0.882           0.118
stress        0.378             0.378           0.000
yew           0.792             0.792           0.000
All           0.649             0.669           0.020

The table shows that 7 out of 29 nouns obtained worse results when using the domains, whereas 13 obtained better results and the remaining 9 were unchanged. Overall, however, we obtained only a very small improvement in precision (2%; see footnote 7).
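The per-word tallies can be re-derived from the two precision columns of Table 16. The short check below is a sketch added for illustration; the variable names are hypothetical, and the pairs are copied from the table in row order (the "All" row is excluded).

```python
# (without domains, with domains) precision pairs, one per noun, in table order
rows = [
    (0.667, 0.778), (0.600, 0.700), (0.625, 0.615), (0.865, 0.919),
    (0.898, 0.898), (0.567, 0.597), (0.661, 0.695), (0.560, 0.600),
    (0.408, 0.388), (0.676, 0.669), (0.909, 0.909), (0.800, 0.800),
    (0.429, 0.500), (0.850, 0.850), (0.708, 0.688), (0.540, 0.620),
    (0.759, 0.793), (1.000, 0.957), (0.900, 0.900), (0.534, 0.552),
    (0.569, 0.588), (0.720, 0.720), (0.459, 0.459), (0.463, 0.512),
    (0.516, 0.452), (0.676, 0.622), (0.765, 0.882), (0.378, 0.378),
    (0.792, 0.792),
]

# Count nouns whose precision rises, falls, or stays the same with domains
improved = sum(1 for without, with_dom in rows if with_dom > without)
worse = sum(1 for without, with_dom in rows if with_dom < without)
unchanged = sum(1 for without, with_dom in rows if with_dom == without)

print(len(rows), improved, worse, unchanged)
```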

The experiment also yields conclusions about the relevance of domain information for each word. In general, the larger improvements appear for words whose senses have well-differentiated domains (spade, authority). Conversely, the word stress, most of whose senses belong to the FACTOTUM domain, does not improve at all. For spade, art and authority (with an improvement of about 10% or more), domain data seems to be an important source of knowledge, carrying information that is not captured by the other types of features. For the words whose precision decreases (by up to 6.5%), domain information is confusing. Three reasons may explain this behavior: the examples have no clear domain or do not represent the context correctly; the domains do not differentiate the senses appropriately; or the number of training examples is too low to allow a valid assessment. If more examples were available, cross-validation could be used to tune the use of domains for each word, that is, to determine which words should go through this preprocessing step and which should not.
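The per-word tuning suggested above was not carried out in this experiment; the following is only a minimal sketch of how such a selection might look, assuming cross-validated precision is available for each word with and without domain features. The function name `select_domain_words`, the `min_gain` threshold, and all fold scores are hypothetical.

```python
def select_domain_words(cv_scores, min_gain=0.0):
    """Return the words for which domain features help on average.

    cv_scores: word -> list of (precision_without, precision_with), one per fold
    min_gain:  hypothetical threshold on the mean gain across folds
    """
    selected = set()
    for word, folds in cv_scores.items():
        mean_gain = sum(with_dom - without for without, with_dom in folds) / len(folds)
        if mean_gain > min_gain:
            selected.add(word)
    return selected

# Hypothetical fold scores, for illustration only (not experimental results)
cv_scores = {
    "spade": [(0.70, 0.85), (0.75, 0.90)],
    "restraint": [(0.55, 0.50), (0.50, 0.45)],
    "stress": [(0.40, 0.40), (0.35, 0.35)],
}
selected = select_domain_words(cv_scores)
```

Words outside the selected set would simply be classified with the feature set that omits the domain features.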

Nevertheless, the experiment empirically demonstrates that a knowledge-based method, such as the domain heuristic, can be integrated successfully into a corpus-based system, such as maximum entropy, to obtain a small improvement.


Footnote 7: This difference proves to be statistically significant when applying the test of the corrected difference of two proportions [Dietterich 1998; Snedecor and Cochran 1989].