                                Rayid Ghani
                                Graduate Student
                                M.S. Program in Knowledge Discovery & Data Mining
                                Center for Automated Learning & Discovery,
                                School of Computer Science,
                                Carnegie Mellon University

                Advisor: Tom Mitchell

Research Interests:
Machine Learning and its applications to real-world problems. 
My specific interest is in using Machine Learning methods for Text Classification and Information Extraction. I am also interested in Data Mining in general and Web Mining in particular.
I am a member of the Text Learning Group and the WebKB Group at Carnegie Mellon University.



Data Mining:

Using Text Mining to Infer Semantic Attributes for Retail Data Mining. Rayid Ghani and Andrew Fano
IEEE International Conference on Data Mining (ICDM 2002)

Building Recommender Systems Using a Knowledge Base of Product Semantics. Rayid Ghani and Andrew Fano
Workshop on Recommendation and Personalization in ECommerce (RPEC 2002) at the Second International Conference on Adaptive Hypermedia and Adaptive Web-based Systems (AH 2002)

Data Mining on Symbolic Knowledge Extracted from the Web. Rayid Ghani, Rosie Jones, Dunja Mladenic, Kamal Nigam, Sean Slattery  
Proceedings of the Workshop on Text Mining at the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000).

Text Classification:

Using Error-Correcting Codes for Efficient Text Classification with a Large Number of Categories. Rayid Ghani. Master's Thesis. Center for Automated Learning & Discovery, Carnegie Mellon University (2001)

A Study of Approaches for Hypertext Categorization. Yiming Yang, Sean Slattery and Rayid Ghani.
Journal of Intelligent Information Systems - Special Issue on Automatic Text Categorization (2002).

Hypertext Categorization using Hyperlink Patterns and Meta Data. Rayid Ghani, Sean Slattery, and Yiming Yang. 
Proceedings of the 18th International Conference on Machine Learning (ICML 2001).

Using Error-Correcting Codes for Text Classification. Rayid Ghani. 
Proceedings of the 17th International Conference on Machine Learning (ICML 2000).

Combining Labeled and Unlabeled Data:

A Comparison of Efficacy and Assumptions of Bootstrapping Algorithms for Training Information Extraction Systems. Rayid Ghani and Rosie Jones.
Workshop on Linguistic Knowledge Acquisition and Representation at the Third International Conference on Language Resources and Evaluation (LREC 2002).

Combining Labeled and Unlabeled data for Multiclass Text Classification. Rayid Ghani. Proceedings of the 19th International Conference on Machine Learning (ICML 2002)

Combining Labeled and Unlabeled data for Text Classification with a Large Number of Categories. Rayid Ghani. IEEE Conference on Data Mining (2001)

Using Error-correcting codes with co-training for text classification with a large number of categories. Rayid Ghani. Workshop on Text Mining at the IEEE Conference on Data Mining (2001) (expanded version at ICML 2002)

Analyzing the Effectiveness and Applicability of Co-Training. Kamal Nigam & Rayid Ghani. 
Proceedings of the Ninth International Conference on Information and Knowledge Management (CIKM 2000).

Understanding the Behavior of Co-Training. Kamal Nigam & Rayid Ghani. 
Proceedings of the Workshop on Text Mining at the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000).

Web Mining to Create Minority Language Corpora:

Building Minority Language Corpora by Learning to Generate Web Search Queries. Rayid Ghani, Rosie Jones, and Dunja Mladenic
Journal of Knowledge and Information Systems (2003)

Building Minority Language Corpora by Learning to Generate Web Search Queries. Rayid Ghani, Rosie Jones, and Dunja Mladenic
Carnegie Mellon University Center for Automated Learning and Discovery Technical Report CMU-CALD-01-100 (2001)

Using the Web to Create Minority Language Corpora. Rayid Ghani, Rosie Jones, and Dunja Mladenic
Proceedings of the Tenth International Conference on Information and Knowledge Management (CIKM 2001).

Online Learning for Query Generation: Finding Documents Matching a Minority Concept on the Web. Rayid Ghani, Rosie Jones, and Dunja Mladenic
Proceedings of the First International Conference on Web Intelligence (2001).

Automatic Web Search Query Generation to Create Minority Language Corpora. Rayid Ghani, Rosie Jones, and Dunja Mladenic
Poster Paper in the Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2001).

Learning a Monolingual Language Model from a Multilingual Text Database. Rayid Ghani & Rosie Jones. 
Proceedings of the Ninth International Conference on Information and Knowledge Management (CIKM 2000).

Automatically Building a Corpus for a Minority Language From the Web. Rosie Jones & Rayid Ghani. 
Proceedings of the Student Workshop at the 38th Annual Meeting of the Association for Computational Linguistics (ACL-2000).

Contact Information (finger me for current information)
Office: 161 N. Clark St., Chicago, IL 60601  (312) 693-6653 

                Recent Projects & Talks

                Academic History

                Resumé [Postscript Version][PDF Version]



                My Photography Website

                Interesting Links

                Last Updated: 20-Feb-2003 12:20:17