Instance Acquisition refers to extracting instances of a given semantic class name (e.g., car makers => ford, nissan, toyota). ASIA extracts set instances by combining Hearst patterns with the state-of-the-art set expansion technique implemented in SEAL (see below). ASIA currently supports input in multiple languages, including Chinese, Japanese, and English.
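As a rough illustration of the Hearst-pattern idea, the sketch below matches a single English pattern, "<class name> such as A, B, and C", and returns the enumerated items. The function name `hearst_such_as` and the sample sentence are hypothetical; this is not ASIA's code, and it assumes the enumerated list runs to the end of its clause.

```python
import re

def hearst_such_as(class_name, text):
    """Return lowercase instances found by the pattern '<class_name> such as A, B, and C'.

    Illustrative assumption: the list ends at the next '.', ';' or ':'.
    """
    pattern = re.compile(
        re.escape(class_name) + r"\s+such\s+as\s+([^.;:]+)",
        re.IGNORECASE,
    )
    instances = []
    for match in pattern.finditer(text):
        # Split the enumeration on commas and on the conjunctions "and" / "or".
        parts = re.split(r",\s*(?:and\s+|or\s+)?|\s+and\s+|\s+or\s+", match.group(1))
        for part in parts:
            item = part.strip().lower()
            if item and item not in instances:
                instances.append(item)
    return instances

sentence = "They compared car makers such as Ford, Nissan, and Toyota."
print(hearst_such_as("car makers", sentence))
```

A single hand-written pattern like this is brittle on real web text, which is one reason to pair pattern-based candidates with a set expansion step.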
Set Expansion refers to expanding a given partial set of objects into a more complete set (e.g., ford, nissan => toyota, audi, buick). A well-known web-based set expansion system is Google Sets. SEAL uses a novel method for expanding sets of named entities. The approach can be applied to semi-structured documents written in any markup language and in any human language.
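To make the task concrete, here is a toy sketch of set expansion over one semi-structured document. It assumes that items of the same set are bracketed by the same surrounding characters, finds the longest left and right contexts shared by the first occurrence of every seed, and returns every string bracketed by those contexts. The names `expand_set`, `common_prefix`, and `common_suffix` and the sample table are hypothetical, and this is a simplification for illustration rather than SEAL's actual implementation. Because it only compares characters, it does not depend on the markup language or the human language of the page.

```python
import re

def common_prefix(strings):
    """Longest string that every element of `strings` starts with."""
    prefix = strings[0]
    for s in strings[1:]:
        i = 0
        while i < min(len(prefix), len(s)) and prefix[i] == s[i]:
            i += 1
        prefix = prefix[:i]
    return prefix

def common_suffix(strings):
    """Longest string that every element of `strings` ends with."""
    return common_prefix([s[::-1] for s in strings])[::-1]

def expand_set(seeds, document):
    """Return all strings in `document` bracketed by the contexts shared by the seeds."""
    lefts, rights = [], []
    for seed in seeds:
        pos = document.find(seed)  # first occurrence only (simplifying assumption)
        if pos < 0:
            return []
        lefts.append(document[:pos])
        rights.append(document[pos + len(seed):])
    left, right = common_suffix(lefts), common_prefix(rights)
    if not left or not right:
        return []
    # Lookbehind/lookahead keep adjacent matches from consuming each other's context.
    pattern = "(?<=" + re.escape(left) + ")(.+?)(?=" + re.escape(right) + ")"
    found = []
    for match in re.finditer(pattern, document):
        if match.group(1) not in found:
            found.append(match.group(1))
    return found

page = "\n".join(
    "<tr><td>%s</td><td>%s</td></tr>" % row
    for row in [("ford", "USA"), ("nissan", "Japan"), ("toyota", "Japan"),
                ("audi", "Germany"), ("buick", "USA")]
)
print(expand_set(["ford", "nissan"], page))
```

A practical system would also need to gather candidate pages and rank the extracted strings; both are outside the scope of this sketch.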
Andrew Carlson, Justin Betteridge, Richard C. Wang, Estevam R. Hruschka Jr. and Tom M. Mitchell: Coupled Semi-Supervised Learning for Information Extraction. In Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM 2010), New York (Brooklyn), New York, USA. 2010.
Richard C. Wang and William W. Cohen: Automatic Set Instance Extraction using the Web. In Proceedings of Joint Conference of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2009), Suntec City, Singapore. 2009.