There
is a positive correlation between news reports on a company's
financial outlook and the company's attractiveness as an
investment. However,
because of the volume of such reports, it is impossible
for financial analysts or investors to track and read all
of them.
A system
that automatically classifies news reports that reflect
positively or negatively on a company's financial outlook
would greatly benefit analysts and investors. In the application
domain of stock portfolio management (see Warren),
software agents that evaluate the risks associated with
the individual companies of a portfolio should be able to
read, classify and weigh electronic news articles, to give
investors an indication of the financial outlook of a company.
To accomplish
this task, we treat the unsupervised reading and understanding
of news articles as an automatic text classification problem.
In this project, we are developing an automatic text classifying
technique resulting in software agents that we call "Domain
Experts." Domain Experts use a sampling algorithm --
based on the Weighted Majority algorithm -- that makes use of
frequently co-occurring phrases as their feature vector.
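
As a rough illustration of how a Weighted Majority-style vote over phrase features might work, the sketch below treats each phrase as an "expert" that votes when the phrase appears in an article. It is not the Domain Experts implementation: the function names, the example phrases, the abstention rule, and the penalty factor beta are all assumptions made for this example.

```python
from typing import Callable, List

# An expert maps an article to +1 (positive outlook), -1 (negative), or 0 (abstain).
Expert = Callable[[str], int]

def phrase_expert(phrase: str, polarity: int) -> Expert:
    """Hypothetical expert: votes `polarity` if `phrase` occurs, else abstains."""
    def expert(text: str) -> int:
        return polarity if phrase in text.lower() else 0
    return expert

def predict(experts: List[Expert], weights: List[float], text: str) -> int:
    """Weighted vote; ties are broken in favor of +1 (an arbitrary choice)."""
    vote = sum(w * e(text) for e, w in zip(experts, weights))
    return 1 if vote >= 0 else -1

def update(experts: List[Expert], weights: List[float],
           text: str, label: int, beta: float = 0.5) -> List[float]:
    """Multiply the weight of every expert that voted wrongly by beta.

    Abstaining experts are left untouched (an assumption of this sketch).
    """
    new_weights = []
    for e, w in zip(experts, weights):
        vote = e(text)
        wrong = vote != 0 and vote != label
        new_weights.append(w * beta if wrong else w)
    return new_weights

# Usage: two hypothetical phrase experts, one labeled article.
experts = [phrase_expert("shares rose", 1), phrase_expert("profit warning", -1)]
weights = [1.0, 1.0]
article = "Shares rose despite a profit warning"
weights = update(experts, weights, article, label=-1)
prediction = predict(experts, weights, article)
```

After one labeled example, the expert that voted wrongly carries half its earlier weight, so the same article is now classified as negative.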
We call
the sampling technique used by Domain Experts "self-confident"
sampling. Briefly, "self-confident" sampling is a technique
for sampling more promising data from unlabeled data sets
to improve a classifier's performance. "Self-confident"
sampling is a pseudo-labeling method that predicts
a label for each unlabeled example on the basis of its entropy
value and the trainer's confidence, which is
acquired during the training phase.
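
The idea of pseudo-labeling from entropy and trainer confidence can be sketched as follows. This is only one plausible reading of the description above, not the published method: the way the two quantities are combined (a product compared against a threshold), the threshold value, and all names are assumptions made for this example.

```python
import math
from typing import Dict, Optional

def entropy(probs: Dict[str, float]) -> float:
    """Shannon entropy (in bits) of a predicted class distribution."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

def pseudo_label(probs: Dict[str, float],
                 trainer_confidence: float,
                 threshold: float = 0.5) -> Optional[str]:
    """Return a pseudo-label for an unlabeled example, or None to skip it.

    Assumed rule: certainty = 1 - normalized entropy; the example is
    accepted when certainty * trainer_confidence reaches the threshold.
    """
    max_entropy = math.log2(len(probs)) if len(probs) > 1 else 0.0
    certainty = 1.0 - entropy(probs) / max_entropy if max_entropy > 0 else 1.0
    if certainty * trainer_confidence >= threshold:
        return max(probs, key=probs.get)
    return None

# Usage: a confident prediction is kept, an ambiguous one is skipped.
kept = pseudo_label({"positive": 0.95, "negative": 0.05}, trainer_confidence=0.9)
skipped = pseudo_label({"positive": 0.5, "negative": 0.5}, trainer_confidence=0.9)
```

Under this rule, low-entropy predictions from a trainer that performed well during training are added to the labeled set, while ambiguous examples are left unlabeled.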
Instructions
for access to TextMiner:
- To receive access to TextMiner, please print
the
CMU License Agreement
(.pdf).
Click
here
to download Adobe Reader.
- Read
carefully and if you agree to the terms, complete the bottom
portion of the Agreement. Include your name, institutional
affiliation and address, a url for the website that describes
your group's or your own research activities, your email
address, and, if you are a student, the name, position,
url and email address of your advisor. Please sign and date
the agreement.
- Send
the completed agreement to us by mail at:
Katia Sycara
The Robotics Institute
5000 Forbes Avenue
Pittsburgh, PA 15213
- We
will send qualified users a user name and password via email,
so that you can access the executable by downloading Communicator
Library v1.4.1_Apr2003 (Jar) from the downloads page, here.
Download
GoodNews (TextMiner)
Interface, and instructions for use.
Access
the Data for TextMiner Testing/Training
Publications
Young-Woo
Seo, Joseph Giampapa, and Katia Sycara, Financial
news analysis for intelligent portfolio management, Tech.
Report CMU-RI-TR-04-04, Robotics Institute, Carnegie Mellon
University, Jan. 2004.
Young-Woo
Seo, Joseph A. Giampapa, and Katia P. Sycara, Text
classification for intelligent agent portfolio management,
Tech. Report CMU-RI-TR-02-14, Robotics Institute, Carnegie
Mellon University, May 2002.
Robotics
Institute Project Page