 
 
 
 
 
   
- 1
- 
F. Provost and T. Fawcett, ``Robust Classification for Imprecise
  Environments,'' Machine Learning, vol. 42/3, pp. 203-231, 2001.
- 2
- 
T. Fawcett and F. Provost, ``Combining Data Mining and Machine
  Learning for Effective User Profiling,'' in Proceedings of the 2nd
  International Conference on Knowledge Discovery and Data Mining, (Portland,
  OR), pp. 8-13, AAAI, 1996.
- 3
- 
K. J. Ezawa, M. Singh, and S. W. Norton, ``Learning Goal Oriented
  Bayesian Networks for Telecommunications Risk Management,'' in 
  Proceedings of the International Conference on Machine Learning, ICML-96,
  (Bari, Italy), pp. 139-147, Morgan Kaufmann, 1996.
- 4
- 
D. Lewis and J. Catlett, ``Heterogeneous Uncertainty Sampling for
  Supervised Learning,'' in Proceedings of the Eleventh International
  Conference on Machine Learning, (San Francisco, CA), pp. 148-156, Morgan
  Kaufmann, 1994.
- 5
- 
S. Dumais, J. Platt, D. Heckerman, and M. Sahami, ``Inductive Learning
  Algorithms and Representations for Text Categorization,'' in 
  Proceedings of the Seventh International Conference on Information and
  Knowledge Management, pp. 148-155, 1998.
- 6
- 
D. Mladenic and M. Grobelnik, ``Feature Selection for Unbalanced
  Class Distribution and Naive Bayes,'' in Proceedings of the 16th
  International Conference on Machine Learning, pp. 258-267, Morgan
  Kaufmann, 1999.
- 7
- 
D. Lewis and M. Ringuette, ``A Comparison of Two Learning Algorithms
  for Text Categorization,'' in Proceedings of SDAIR-94, 3rd Annual
  Symposium on Document Analysis and Information Retrieval, pp. 81-93, 1994.
- 8
- 
W. Cohen, ``Learning to Classify English Text with ILP Methods,'' in
  Proceedings of the 5th International Workshop on Inductive Logic
  Programming, pp. 3-24, Department of Computer Science, Katholieke
  Universiteit Leuven, 1995.
- 9
- 
M. Kubat, R. Holte, and S. Matwin, ``Machine Learning for the Detection of
  Oil Spills in Satellite Radar Images,'' Machine Learning,
  vol. 30, pp. 195-215, 1998.
- 10
- 
K. Woods, C. Doss, K. Bowyer, J. Solka, C. Priebe, and P. Kegelmeyer,
  ``Comparative Evaluation of Pattern Recognition Techniques for
  Detection of Microcalcifications in Mammography,'' International
  Journal of Pattern Recognition and Artificial Intelligence, vol. 7(6),
  pp. 1417-1436, 1993.
- 11
- 
J. Swets, ``Measuring the Accuracy of Diagnostic Systems,'' 
  Science, vol. 240, pp. 1285-1293, 1988.
- 12
- 
R. Duda, P. Hart, and D. Stork, Pattern Classification.
 Wiley-Interscience, 2001.
- 13
- 
A. P. Bradley, ``The Use of the Area Under the ROC Curve in the
  Evaluation of Machine Learning Algorithms,'' Pattern
  Recognition, vol. 30(6), pp. 1145-1159, 1997.
- 14
- 
S. Lee, ``Noisy Replication in Skewed Binary Classification,'' 
  Computational Statistics and Data Analysis, vol. 34, 2000.
- 15
- 
M. Pazzani, C. Merz, P. Murphy, K. Ali, T. Hume, and C. Brunk, ``Reducing
  Misclassification Costs,'' in Proceedings of the Eleventh
  International Conference on Machine Learning, (San Francisco, CA), Morgan
  Kaufmann, 1994.
- 16
- 
P. Domingos, ``Metacost: A General Method for Making Classifiers
  Cost-sensitive,'' in Proceedings of the Fifth ACM SIGKDD International
  Conference on Knowledge Discovery and Data Mining, (San Diego, CA),
  pp. 155-164, ACM Press, 1999.
- 17
- 
M. Kubat and S. Matwin, ``Addressing the Curse of Imbalanced Training
  Sets: One Sided Selection,'' in Proceedings of the Fourteenth
  International Conference on Machine Learning, (Nashville, Tennessee),
  pp. 179-186, Morgan Kaufmann, 1997.
- 18
- 
N. Japkowicz, ``The Class Imbalance Problem: Significance and
  Strategies,'' in Proceedings of the 2000 International Conference on
  Artificial Intelligence (IC-AI'2000): Special Track on Inductive Learning,
  (Las Vegas, Nevada), 2000.
- 19
- 
C. Ling and C. Li, ``Data Mining for Direct Marketing Problems and
  Solutions,'' in Proceedings of the Fourth International Conference on
  Knowledge Discovery and Data Mining (KDD-98), (New York, NY), AAAI Press,
  1998.
- 20
- 
N. Chawla, K. Bowyer, L. Hall, and P. Kegelmeyer, ``SMOTE: Synthetic
  Minority Over-sampling TEchnique,'' in International Conference of
  Knowledge Based Computer Systems, pp. 46-57, National Center for Software
  Technology, Mumbai, India, Allied Press, 2000.
- 21
- 
J. Quinlan, C4.5: Programs for Machine Learning.
 San Mateo, CA: Morgan Kaufmann, 1992.
- 22
- 
W. W. Cohen, ``Fast Effective Rule Induction,'' in Proc. 12th
  International Conference on Machine Learning, (Lake Tahoe, CA),
  pp. 115-123, Morgan Kaufmann, 1995.
- 23
- 
C. Drummond and R. Holte, ``Explicitly Representing Expected Cost: An
  Alternative to ROC Representation,'' in Proceedings of the Sixth
  ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,
  (Boston), pp. 198-207, ACM, 2000.
- 24
- 
P. Turney, ``Cost Sensitive Bibliography.''
  http://ai.iit.nrc.ca/bibiliographies/cost-sensitive.html, 1996.
- 25
- 
F. Provost, T. Fawcett, and R. Kohavi, ``The Case Against Accuracy
  Estimation for Comparing Induction Algorithms,'' in Proceedings
  of the Fifteenth International Conference on Machine Learning, (Madison,
  WI), pp. 445-453, Morgan Kaufmann, 1998.
- 26
- 
I. Tomek, ``Two Modifications of CNN,'' IEEE Transactions on Systems,
  Man and Cybernetics, vol. 6, pp. 769-772, 1976.
- 27
- 
A. Solberg and R. Solberg, ``A Large-Scale Evaluation of Features for
  Automatic Detection of Oil Spills in ERS SAR Images,'' in 
  International Geoscience and Remote Sensing Symposium, (Lincoln, NE),
  pp. 1484-1486, 1996.
- 28
- 
E. DeRouin, J. Brown, L. Fausett, and M. Schneider, ``Neural Network
  Training on Unequally Represented Classes,'' in Intelligent
  Engineering Systems Through Artificial Neural Networks, (New York),
  pp. 135-141, ASME Press, 1991.
- 29
- 
C. van Rijsbergen, D. Harper, and M. Porter, ``The Selection of Good
  Search Terms,'' Information Processing and Management, vol. 17,
  pp. 77-91, 1981.
- 30
- 
T. M. Ha and H. Bunke, ``Off-line, Handwritten Numeral Recognition by
  Perturbation Method,'' Pattern Analysis and Machine Intelligence,
  vol. 19/5, pp. 535-539, 1997.
- 31
- 
W. W. Cohen and Y. Singer, ``Context-sensitive Learning Methods for Text
  Categorization,'' in Proceedings of SIGIR-96, 19th ACM
  International Conference on Research and Development in Information
  Retrieval (H.-P. Frei, D. Harman, P. Schäuble, and R. Wilkinson,
  eds.), (Zürich, CH), pp. 307-315, ACM Press, New York, US, 1996.
- 32
- 
C. Blake and C. Merz, ``UCI Repository of Machine Learning Databases
  http://www.ics.uci.edu/~mlearn/MLRepository.html.'' Department
  of Information and Computer Sciences, University of California,
  Irvine, 1998.
- 33
- 
L. Hall, B. Mohney, and L. Kier, ``The Electrotopological State:
  Structure Information at the Atomic Level for Molecular Graphs,''
  Journal of Chemical Information and Computer Sciences, vol. 31, p. 76,
  1991.
- 34
- 
N. Chawla and L. Hall, ``Modifying MUSTAFA to capture salient data,'' Tech.
  Rep. ISL-99-01, University of South Florida, Computer Science and Eng. Dept.,
  1999.
- 35
- 
J. O'Rourke, Computational Geometry in C.
 UK: Cambridge University Press, 1998.
- 36
- 
C. Stanfill and D. Waltz, ``Toward Memory-based Reasoning,'' 
  Communications of the ACM, vol. 29, no. 12, pp. 1213-1228, 1986.
- 37
- 
S. Cost and S. Salzberg, ``A Weighted Nearest Neighbor Algorithm for
  Learning with Symbolic Features,'' Machine Learning, vol. 10,
  no. 1, pp. 57-78, 1993.
Nitesh Chawla (CS)
6/2/2002