Rayid Ghani

Accenture Technology Labs

161 N. Clark St.

Chicago, IL 60601

(312) 693-6653

rayid.ghani@accenture.com

http://www.accenture.com/techlabs/ghani

 

Research Interests:

• Machine Learning

• Data Mining

• Semi-Supervised Learning – Combining labeled and unlabeled data to improve statistical models.

• Text Mining – Applying Machine Learning and Statistical techniques to automatic text classification and information extraction.

 

Research Publications:

 

Predicting the End-Prices of Online Auctions. Rayid Ghani and Hillery Simmons. Workshop on Data Mining & Adaptive Modeling Methods for Economics & Management – held with the European Conference on Machine Learning (ECML/PKDD 2004).

Mining the Web to Add Semantics to Retail Data Mining. Rayid Ghani. Invited Chapter. “Web Mining: From Web to Semantic Web”. Springer Lecture Notes in Artificial Intelligence, Vol. 3209. Berendt, B.; Hotho, A.; Mladenic, D.; van Someren, M.; Spiliopoulou, M.; Stumme, G. (Eds.) 2004.

Predicting Customer Grocery Shopping Lists from Point-of-Sale Purchase Data. Chad Cumby, Andrew Fano, Rayid Ghani, and Marko Krema. Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004).

Active learning for information extraction with multiple view feature sets. Rosie Jones, Rayid Ghani, Tom Mitchell and Ellen Riloff. Workshop on Adaptive Text Extraction and Mining at European Conference on Machine Learning (ECML/PKDD 2003).

Building Minority Language Corpora by Learning to Generate Web Search Queries. Rayid Ghani, Rosie Jones and Dunja Mladenic. Journal of Knowledge and Information Systems (KAIS), 2003.

Using Text Mining to Infer Semantic Attributes for Retail Data Mining. Rayid Ghani, Andrew Fano. IEEE International Conference on Data Mining (ICDM 2002)

Building Recommender Systems using a Knowledge Base of Product Semantics. Rayid Ghani, Andrew Fano. Workshop on Recommender Systems and Personalization in Ecommerce at 2nd International Conference on Adaptive Hypermedia and Adaptive Web Based Systems (2002)

Combining Labeled and Unlabeled Data for Multiclass Text Classification. Rayid Ghani. Proceedings of the 19th International Conference on Machine Learning (ICML 2002).

A Comparison of Efficacy of Bootstrapping Algorithms for Information Extraction. Rayid Ghani and Rosie Jones. Proceedings of the Workshop on Linguistic Knowledge Acquisition at the Language Resources and Evaluation Conference (LREC 2002).

Web Mining for Automatic Corpus Construction. Rayid Ghani, Rosie Jones, and Dunja Mladenic. Knowledge and Information Sciences Journal – Special Issue on Information and Knowledge Management (2003).

A Study of Approaches for Hypertext Categorization. Yiming Yang, Sean Slattery and Rayid Ghani. Journal of Intelligent Information Systems - Special Issue on Automatic Text Categorization (2002).

Hypertext Categorization using Hyperlink Patterns and Meta Data. Rayid Ghani, Sean Slattery, and Yiming Yang. Proceedings of the 18th International Conference on Machine Learning (ICML 2001).

Using Error-Correcting Codes for Efficient Text Classification with a Large Number of Categories. Rayid Ghani. Master's Thesis. Center for Automated Learning & Discovery, Carnegie Mellon University (2001)

Combining Labeled and Unlabeled data for Text Classification with a Large Number of Categories. Rayid Ghani. Proceedings of the First IEEE Conference on Data Mining (ICDM 2001)

Using Error-Correcting Codes and Co-Training for Text Classification with a Large Number of Categories. Rayid Ghani. Workshop on Text Mining at the First IEEE Conference on Data Mining (2001)

Building Minority Language Corpora by Learning to Generate Web Search Queries. Rayid Ghani, Rosie Jones, and Dunja Mladenic. Carnegie Mellon University Center for Automated Learning and Discovery Technical Report CMU-CALD-01-100 (2001)

Using the Web to Create Minority Language Corpora. Rayid Ghani, Rosie Jones, and Dunja Mladenic. Proceedings of the Tenth International Conference on Information and Knowledge Management (CIKM 2001).

Online Learning for Query Generation: Finding Documents Matching a Minority Concept on the Web. Rayid Ghani, Rosie Jones, and Dunja Mladenic. Proceedings of the First International Conference on Web Intelligence (2001).

Automatic Web Search Query Generation to Create Minority Language Corpora. Rayid Ghani, Rosie Jones, and Dunja Mladenic. Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2001).

Using Error-Correcting Codes for Text Classification. Rayid Ghani. Proceedings of the 17th International Conference on Machine Learning (ICML 2000).

Analyzing the Effectiveness and Applicability of Co-Training. Kamal Nigam & Rayid Ghani. Proceedings of the Ninth International Conference on Information and Knowledge Management (CIKM 2000).

Understanding the Behavior of Co-Training. Kamal Nigam & Rayid Ghani. Proceedings of the Workshop on Text Mining at the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000).

Learning a Monolingual Language Model from a Multilingual Text Database. Rayid Ghani & Rosie Jones. Proceedings of the Ninth International Conference on Information and Knowledge Management (CIKM 2000).

Automatically Building a Corpus for a Minority Language From the Web. Rosie Jones & Rayid Ghani. Proceedings of the Student Workshop at the 38th Annual Meeting of the Association for Computational Linguistics (ACL-2000).

Data Mining on Symbolic Knowledge Extracted from the Web. Rayid Ghani, Rosie Jones, Dunja Mladenic, Kamal Nigam, Sean Slattery. Proceedings of the Workshop on Text Mining at the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000).

 

Research Community Activities:

Member of the Advisory Board for the European Union Project on Semantically Enabled Knowledge Technologies, 2004-2006

Invited Speaker: Web Mining Workshop at European Conference on Machine Learning & Principles of Data Mining (ECML/PKDD 2003)

Organizer, Workshop on “The Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining” – held with International Conference on Machine Learning, 2003

Organizer, Workshop on Operational Text Classification – held with Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2003

 

Program Committee Member

ACM Conference on Research and Development in Information Retrieval (SIGIR 2004)

International Conference on Machine Learning (ICML 2003 and 2004)

Link Discovery Workshop (LinkKDD) held with Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004

Web Mining Workshop (WebKDD) held with Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004

Adaptive Text Extraction & Mining Workshop at ECML 2003

Text Learning Workshop at the International Conference on Machine Learning (ICML 2002)

Operational Text Classification Workshop at the ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2002)

Text Mining Workshop at the IEEE Conference on Data Mining, 2001.

Reviewer

European Conference on Artificial Intelligence (ECAI 2002)

International Conference on Machine Learning (ICML 2001).

Journal of Artificial Intelligence Research (JAIR)

Journal of Machine Learning Research (JMLR)

 

Education:

Center for Automated Learning & Discovery, Carnegie Mellon University, Pittsburgh, PA

M.S. in Knowledge Discovery & Data Mining, May 2001

Advisor: Tom Mitchell

Coursework in Machine Learning, Text Mining, Information Retrieval, Advanced Information Retrieval Seminar, Data Mining in Multimedia Databases, Statistical Approaches for Learning & Discovery, Advanced AI Concepts

 

University of the South, Sewanee, TN

B.S. (with Honors) Summa Cum Laude, May 1999.

Majors: Computer Science, Mathematics.

 

Research Experience:

 

Researcher, Accenture Technology Labs, Accenture, Chicago, IL. July 2001 – Present

Member of the Research Group at Accenture Technology Labs. Conducting research in the areas of Machine Learning, Data Mining and Text Mining. The Labs’ goals include researching and inventing the next wave of business solutions using new and emerging technologies and exploring how these will evolve, converge and shape businesses in the future. Recent projects include automatically constructing knowledge bases of product semantics using text learning techniques, semi-supervised information extraction and active learning, constructing individual consumer models from massive transaction databases for personalized interactions in the retail domain, and developing analytical tools for mining online marketplaces.

 

Researcher, Center for Automated Learning & Discovery, Carnegie Mellon University, Pittsburgh, PA. June 2001 – August 2001

Researched the use of Machine Learning techniques for Intrusion Detection. Joint work with Roy Maxion, Director of the Dependable Systems Lab, Carnegie Mellon University. Also worked on Web Log Analysis using statistical methods.

 

Researcher – Text Mining & Computational Linguistics Group, IBM T.J. Watson Research Center. June – August 2000

Research in text classification and learning with unlabeled data.

 

Graduate Research – WebKB Project, Center for Automated Learning & Discovery, Carnegie Mellon University, Pittsburgh, PA. August 1999 – May 2001

Performed research in the areas of statistical text learning, learning with unlabeled data, and information extraction. Part of the WebKB group whose aim was to build a web-crawling system that extracts information from the Web into a propositional knowledge base, using a variety of learned text classifiers and information extractors. Research funded by DARPA and CIA.

 

Graduate Research – fMRI Brain Imaging Analysis Project, Center for Automated Learning & Discovery, Carnegie Mellon University, Pittsburgh, PA. August – December 1999

Worked on developing and using Machine Learning algorithms to analyze fMRI images of the human brain, in collaboration with the Center for Cognitive Brain Imaging at Carnegie Mellon University.

 

Society Memberships:

Member, American Association for Artificial Intelligence (AAAI)

Member, Association for Computing Machinery (ACM)

Member, IEEE

Member, Graduate Admissions Committee, Center for Automated Learning & Discovery, Carnegie Mellon University, 2000, 2001.