Rayid Ghani
Accenture Technology Labs
(312) 693-6653
rayid.ghani@accenture.com
http://www.accenture.com/techlabs/ghani
Research Interests:
Machine Learning
Data Mining
Semi-Supervised Learning - Combining labeled and unlabeled data to improve statistical models.
Text Mining - Applying Machine Learning and Statistical techniques to automatic text classification and information extraction.
Research Publications:
Predicting the End-Prices of Online Auctions. Rayid Ghani and
Hillery Simmons. Workshop on Data Mining & Adaptive Modeling Methods for Economics
& Management - held with the European
Conference on Machine Learning (ECML/PKDD 2004).
Mining the Web to Add Semantics to Retail Data Mining. Rayid Ghani. Invited Chapter. “Web Mining: From Web
to Semantic Web”. Springer Lecture Notes in Artificial Intelligence, Vol. 3209. Berendt, B.; Hotho, A.;
Mladenic, D.; van Someren, M.; Spiliopoulou, M.; Stumme, G. (Eds.) 2004.
Predicting Customer Grocery Shopping Lists from
Active learning for information extraction with
multiple view feature sets. Rosie
Jones, Rayid Ghani, Tom Mitchell and Ellen Riloff. Workshop on Adaptive Text Extraction and Mining at European Conference
on Machine Learning (ECML/PKDD 2003).
Building Minority Language Corpora by Learning to
Generate Web Search Queries. Rayid
Ghani, Rosie Jones and Dunja Mladenic. Journal
of Knowledge and Information Systems (KAIS), 2003.
Using Text Mining to Infer Semantic Attributes for
Retail Data Mining. Rayid Ghani,
Andrew Fano. IEEE International Conference
on Data Mining (ICDM 2002)
Building Recommender Systems using a Knowledge Base of
Product Semantics. Rayid Ghani, Andrew Fano.
Workshop on Recommender Systems and Personalization in Ecommerce at 2nd
International Conference on Adaptive Hypermedia and Adaptive Web Based Systems (2002)
Combining Labeled and
Unlabeled for Multiclass Text Classification. Rayid Ghani. Proceedings of the
19th International
Conference on Machine Learning (ICML 2002).
A Comparison of Efficacy of Bootstrapping Algorithms
for Information Extraction. Rayid Ghani and Rosie Jones. Proceedings of the Workshop on
Linguistic Knowledge Acquisition at the Linguistic Resources and Evaluation
Conference (LREC 2002).
Web Mining for Automatic Corpus Construction. Rayid Ghani, Rosie Jones, and
Dunja Mladenic. Knowledge and Information Sciences Journal – Special Issue
on Information and Knowledge Management (2003).
A Study of Approaches
for Hypertext Categorization. Yiming Yang, Sean
Slattery and Rayid Ghani. Journal of Intelligent Information Systems -
Special Issue on Automatic Text Categorization (2002).
Hypertext Categorization
using Hyperlink Patterns and Meta Data. Rayid Ghani, Sean
Slattery, and Yiming Yang. Proceedings of the 18th International Conference
on Machine Learning (ICML 2001).
Using Error-Correcting
Codes for Efficient Text Classification with a Large Number of Categories. Rayid Ghani. Masters Thesis. Center for Automated Learning &
Discovery,
Combining Labeled and
Unlabeled data for Text Classification with a Large Number of Categories. Rayid Ghani. Proceedings of the First IEEE Conference on Data Mining
(ICDM 2001)
Using Error-Correcting
Codes and Co-Training for Text Classification with a Large Number of Categories. Rayid Ghani. Workshop on Text Mining at the First IEEE Conference on
Data Mining (2001)
Building Minority
Language Corpora by Learning to Generate Web Search Queries. Rayid Ghani, Rosie Jones, and Dunja Mladenic.
Using the Web to Create
Minority Language Corpora. Rayid Ghani, Rosie Jones, and
Dunja Mladenic. Proceedings of the Tenth International Conference on
Information and Knowledge Management (CIKM 2001).
Online Learning for
Query Generation: Finding Documents Matching a Minority Concept on the Web. Rayid Ghani, Rosie Jones, and Dunja Mladenic. Proceedings of the
First International Conference on Web Intelligence (2001).
Automatic Web Search
Query Generation to Create Minority Language Corpora. Rayid Ghani, Rosie Jones, and Dunja Mladenic. Proceedings of the
24th Annual International ACM SIGIR Conference on Research and Development in
Information Retrieval (SIGIR 2001).
Using Error-Correcting
Codes for Text Classification. Rayid Ghani. Proceedings
of the 17th International Conference on Machine Learning (ICML 2000).
Analyzing the
Effectiveness and Applicability of Co-Training. Kamal Nigam & Rayid Ghani. Proceedings
of the Ninth International Conference on
Information and Knowledge Management (CIKM 2000).
Understanding the
Behavior of Co-Training. Kamal Nigam & Rayid Ghani. Proceedings
of the Workshop on Text Mining at the Sixth ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining (KDD-2000).
Learning a Monolingual
Language Model from a Multilingual Text Database. Rayid Ghani & Rosie Jones. Proceedings of the Ninth
International Conference on Information and Knowledge Management (CIKM
2000).
Automatically Building a
Corpus for a Minority Language From the Web. Rosie Jones
& Rayid Ghani. Proceedings of the Student Workshop at the 38th Annual
Meeting of the Association for Computational Linguistics (ACL-2000).
Data Mining on Symbolic
Knowledge Extracted from the Web. Rayid Ghani, Rosie
Jones, Dunja Mladenic, Kamal Nigam, Sean Slattery. Proceedings of the
Workshop on Text Mining at the Sixth ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining (KDD-2000).
Research Community Activities:
Member of the Advisory Board for the European
Invited Speaker: Web Mining Workshop at European
Conference on Machine Learning & Principles of Data Mining (ECML/PKDD
2003)
Organizer, Workshop on “The
Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining” –
held with International Conference on Machine Learning, 2003
Organizer, Workshop on
Operational
Program Committee Member
ACM Conference on Research and Development in
Information Retrieval (SIGIR 2004)
International Conference on Machine Learning (ICML), 2003 and 2004
Link Discovery
Workshop (LinkKDD) held with Seventh ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, 2004
Web Mining Workshop
(WebKDD) held with Seventh ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, 2004
Adaptive
Text Learning Workshop at the International Conference on Machine
Learning (ICML 2002)
Operational Text
Classification Workshop at the ACM
SIGIR Conference on Research and Development in Information Retrieval
(SIGIR 2002)
Text Mining Workshop at the IEEE Conference on Data Mining, 2001.
Reviewer
International Conference on Machine Learning (ICML 2001).
Journal of Artificial Intelligence Research (JAIR)
Journal of Machine Learning Research (JMLR)
Education:
M.S. in Knowledge Discovery & Data Mining
May 2001
Advisor: Tom Mitchell
Coursework in Machine
Learning, Text Mining, Information Retrieval, Advanced Information Retrieval
Seminar, Data Mining in Multimedia Databases, Statistical Approaches for
Learning & Discovery, Advanced AI Concepts
University of the South,
B.S. (with Honors) Summa Cum Laude, May 1999.
Majors: Computer Science, Mathematics.
Research Experience:
Researcher, Accenture Technology Labs, Accenture,
Member of the Research
Group at Accenture Technology Labs. Conducting research in the areas of Machine
Learning, Data Mining and Text Mining. The Labs’ goals include researching and
inventing the next wave of business solutions using new and emerging
technologies and exploring how these will evolve, converge and
shape businesses in the future. Recent projects include automatically
constructing knowledge bases of product semantics using text learning
techniques, semi-supervised information extraction and active learning,
constructing
Researcher, Center for Automated
Learning & Discovery,
Researched the use of Machine Learning techniques for Intrusion
Detection. Joint work with Roy Maxion, Director of the Dependable Systems Lab,
Researcher - Text Mining
& Computational Linguistics Group,
Research in text classification and learning
with unlabeled data.
Graduate Research –
WebKB Project, Center
for Automated Learning & Discovery, Carnegie Mellon University, Pittsburgh,
PA. August 1999 – May 2001
Performed research in the areas of statistical
text learning, learning with unlabeled data, and information extraction. Part
of the WebKB group whose aim was to build a web-crawling system that extracts
information from the Web into a propositional knowledge base, using a variety
of learned text classifiers and information extractors. Research funded by
DARPA and CIA.
Graduate Research – fMRI Brain
Imaging Analysis Project, Center for Automated Learning & Discovery,
Worked on developing and using Machine
Learning Algorithms to analyze fMRI images of the human brain in collaboration
with Center for Cognitive Brain Imaging at
Society Memberships:
Member, American
Association for Artificial Intelligence (AAAI)
Member, Association
for Computing Machinery (ACM)
Member, IEEE
Member, Graduate
Admissions Committee, Center for Automated Learning & Discovery,