Text-learning and intelligent agents

Dunja Mladenic

This paper gives an overview of some recent work on intelligent agents, describing the two frequently used approaches: content-based and collaborative. The use of machine learning techniques on text databases (usually referred to as text-learning) is an important part of content-based intelligent agents that work on text documents. The most popular among them are agents for locating information on the World Wide Web and agents for filtering Usenet news. Despite this popularity, there has been little work on finding the most suitable machine learning techniques for text-learning in those domains. The paper reviews some work in text-learning through the prism of three research questions important for the development of text-learning intelligent agents: what representation is used for documents, how the high number of features is dealt with, and which learning algorithm is used. As an example, it gives a brief description and the internal structure of Personal WebWatcher, a content-based intelligent agent that uses text-learning for user-customized Web browsing.
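To make the three research questions concrete, the following is a minimal Python sketch, not the method of Personal WebWatcher or of any system surveyed. It assumes a bag-of-words representation, feature selection by document frequency, and a multinomial naive Bayes classifier with Laplace smoothing; these are chosen only as simple, common baselines. All function names, the toy documents, and the labels are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Question 1 (representation): word counts, with word order ignored."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def select_features(bags, k):
    """Question 2 (many features): keep the k words found in the most documents.

    Document frequency is an assumed, deliberately simple scoring measure.
    """
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())
    # Sort by descending document frequency; break ties alphabetically.
    ranked = sorted(doc_freq, key=lambda w: (-doc_freq[w], w))
    return set(ranked[:k])


def train_naive_bayes(bags, labels, vocab):
    """Question 3 (learning algorithm): collect per-class counts."""
    class_docs = Counter(labels)
    word_counts = {c: Counter() for c in class_docs}
    for bag, label in zip(bags, labels):
        for word, n in bag.items():
            if word in vocab:
                word_counts[label][word] += n
    return class_docs, word_counts, vocab


def classify(model, bag):
    """Return the class with the highest log-probability for the document."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for c in sorted(class_docs):
        total_words = sum(word_counts[c].values())
        score = math.log(class_docs[c] / total_docs)
        for word, n in bag.items():
            if word in vocab:
                # Laplace smoothing avoids log(0) for words unseen in class c.
                p = (word_counts[c][word] + 1) / (total_words + len(vocab))
                score += n * math.log(p)
        if score > best_score:
            best_label, best_score = c, score
    return best_label


# Hypothetical toy data standing in for documents a user has rated.
training = [
    ("machine learning for text classification", "interesting"),
    ("learning agents browse the web", "interesting"),
    ("cheap flights and hotel deals", "not_interesting"),
    ("hotel deals for the summer", "not_interesting"),
]
bags = [bag_of_words(text) for text, _ in training]
labels = [label for _, label in training]
vocab = select_features(bags, k=8)
model = train_naive_bayes(bags, labels, vocab)

print(classify(model, bag_of_words("text learning agents")))
print(classify(model, bag_of_words("hotel deals")))
```

Each function maps to one of the three questions, so a different representation, feature scoring measure, or learning algorithm can be swapped in without touching the other two.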