The same experiment was carried out on the SENSEVAL-2 English lexical-sample task data, and the results are shown in Table 19. Details of how the different systems were built can be found in Section
Again, Table 19 shows that BFS per POS is better than BFS per word, mainly for the same reasons explained in Section 3.3.
Nevertheless, the improvement on nouns obtained with the vME+SM system is not as high as for the Spanish data. The differences between the two corpora have a significant effect on the precision values that can be obtained. For example, the English data includes multi-words and its sense inventory is extracted from WordNet, while the Spanish data set is smaller and its sense inventory comes from a dictionary built specifically for the task, with a lower degree of polysemy.
The results of vME+SM are comparable to those of the systems presented at SENSEVAL-2, where the best system (Johns Hopkins University) reported 64.2% precision (68.2%, 58.5% and 73.9% for nouns, verbs and adjectives, respectively).
Comparing these results with those obtained in Section 4.2, we also see that two configurations achieve very similar performance: a voting system that combines ME under the best feature selection with Specification Marks (vME+SM), and a non-optimized ME with the relevant domain heuristic. That is, it seems that comparable performance can be obtained either by combining different classifiers resulting from a feature selection process or by using a richer set of features (adding the domain features), the latter with much less computational overhead.
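As a rough illustration of how a voting combination such as vME+SM can be organized, the following sketch assumes simple majority voting over the sense labels proposed by several classifiers. It is not the exact scheme used in these experiments, which is described in the earlier sections. The function name `vote`, the tie-breaking rule (the earliest-listed classifier wins) and the sense labels are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def vote(predictions: Sequence[Optional[str]]) -> Optional[str]:
    """Combine sense labels from several classifiers by majority vote.

    Assumption: each classifier proposes one sense label, or None when it
    abstains. Ties are broken in favour of the classifier listed first.
    """
    # Ignore classifiers that abstained.
    votes = [p for p in predictions if p is not None]
    if not votes:
        return None

    counts = Counter(votes)
    best = max(counts.values())

    # Scan in classifier order so that ties go to the earliest classifier.
    for label in votes:
        if counts[label] == best:
            return label
    return None


# Hypothetical outputs of two ME classifiers and a knowledge-based one.
combined = vote(["sense_1", "sense_2", "sense_1"])
```

With this tie-breaking rule the order of the classifiers matters, so a more reliable classifier would be listed first; a weighted vote is an equally plausible alternative.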
This analysis of the results from the SENSEVAL-2 English and Spanish lexical-sample tasks shows that knowledge-based and corpus-based WSD systems can cooperate and be combined to obtain improved WSD systems. The results empirically demonstrate that the combination of both approaches outperforms each of them individually, which suggests that the two approaches are complementary.