Sachin Agarwal

Language Technologies Institute
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213 United States

Experiment Tools

  • Linear regression [Applet]
  • Naive Bayes [Applet]
  • Logistic regression [Applet]
  • Discriminative v. Generative models [Applet]
  • Decision trees [Applet]
  • Boosting [Adaboost Applet]
  • Instance-based learning [Applet]
  • EM for estimating Gaussian mixtures
  • SVMs [Applets]
  • PAC learning [Applets]
  • Java Bayes [Applet]
  • K-Means [Applet]
  • Mixture of Gaussians [Applet]
  • PCA [Applet]
  • Reinforcement Learning [RL Sim Applet]

Conference Proceedings

  • Text REtrieval Conference (TREC)
  • SIGIR
  • Digital Libraries (DL)
  • Conference on Information and Knowledge Management (CIKM)
  • World Wide Web

Datasets

  • Wikipedia database [English] [French] [German]
  • University of California Irvine Machine Learning Repository
  • Text REtrieval Conference (TREC)
  • Reuters-21578 Dataset
  • International Conference on Weblogs and Social Media Dataset
  • NIST
  • The Linguistic Data Consortium
  • LDC resources at CMU
  • The Rosetta Project

Online Books

  • P. Ingwersen. Information Retrieval Interaction. London: Taylor Graham, 1992.
  • C. J. van Rijsbergen. Information Retrieval. London: Butterworths, 1979.

Software and Tools

  • The Porter stemmer
  • Martin Porter's Porter algorithm Web page
  • MXTERMINATOR English sentence boundary detector
  • MXPOST English part of speech tagger
  • Language ID tools
  • R - free software environment for statistical computing and graphics

 

Sachin Agarwal © 2007.

Page last updated: March 30, 2008