

F. Provost and T. Fawcett, ``Robust Classification for Imprecise Environments,'' Machine Learning, vol. 42/3, pp. 203-231, 2001.

T. Fawcett and F. Provost, ``Combining Data Mining and Machine Learning for Effective User Profiling,'' in Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, (Portland, OR), pp. 8-13, AAAI, 1996.

K. J. Ezawa, M. Singh, and S. W. Norton, ``Learning Goal Oriented Bayesian Networks for Telecommunications Risk Management,'' in Proceedings of the International Conference on Machine Learning, ICML-96, (Bari, Italy), pp. 139-147, Morgan Kaufmann, 1996.

D. Lewis and J. Catlett, ``Heterogeneous Uncertainty Sampling for Supervised Learning,'' in Proceedings of the Eleventh International Conference on Machine Learning, (San Francisco, CA), pp. 148-156, Morgan Kaufmann, 1994.

S. Dumais, J. Platt, D. Heckerman, and M. Sahami, ``Inductive Learning Algorithms and Representations for Text Categorization,'' in Proceedings of the Seventh International Conference on Information and Knowledge Management, pp. 148-155, 1998.

D. Mladenic and M. Grobelnik, ``Feature Selection for Unbalanced Class Distribution and Naive Bayes,'' in Proceedings of the 16th International Conference on Machine Learning, pp. 258-267, Morgan Kaufmann, 1999.

D. Lewis and M. Ringuette, ``A Comparison of Two Learning Algorithms for Text Categorization,'' in Proceedings of SDAIR-94, 3rd Annual Symposium on Document Analysis and Information Retrieval, pp. 81-93, 1994.

W. Cohen, ``Learning to Classify English Text with ILP Methods,'' in Proceedings of the 5th International Workshop on Inductive Logic Programming, pp. 3-24, Department of Computer Science, Katholieke Universiteit Leuven, 1995.

M. Kubat, R. Holte, and S. Matwin, ``Machine Learning for the Detection of Oil Spills in Satellite Radar Images,'' Machine Learning, vol. 30, pp. 195-215, 1998.

K. Woods, C. Doss, K. Bowyer, J. Solka, C. Priebe, and P. Kegelmeyer, ``Comparative Evaluation of Pattern Recognition Techniques for Detection of Microcalcifications in Mammography,'' International Journal of Pattern Recognition and Artificial Intelligence, vol. 7(6), pp. 1417-1436, 1993.

J. Swets, ``Measuring the Accuracy of Diagnostic Systems,'' Science, vol. 240, pp. 1285-1293, 1988.

R. Duda, P. Hart, and D. Stork, Pattern Classification.
Wiley-Interscience, 2001.

A. P. Bradley, ``The Use of the Area Under the ROC Curve in the Evaluation of Machine Learning Algorithms,'' Pattern Recognition, vol. 30(6), pp. 1145-1159, 1997.

S. Lee, ``Noisy Replication in Skewed Binary Classification,'' Computational Statistics and Data Analysis, vol. 34, 2000.

M. Pazzani, C. Merz, P. Murphy, K. Ali, T. Hume, and C. Brunk, ``Reducing Misclassification Costs,'' in Proceedings of the Eleventh International Conference on Machine Learning, (San Francisco, CA), Morgan Kaufmann, 1994.

P. Domingos, ``Metacost: A General Method for Making Classifiers Cost-sensitive,'' in Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, (San Diego, CA), pp. 155-164, ACM Press, 1999.

M. Kubat and S. Matwin, ``Addressing the Curse of Imbalanced Training Sets: One Sided Selection,'' in Proceedings of the Fourteenth International Conference on Machine Learning, (Nashville, Tennessee), pp. 179-186, Morgan Kaufmann, 1997.

N. Japkowicz, ``The Class Imbalance Problem: Significance and Strategies,'' in Proceedings of the 2000 International Conference on Artificial Intelligence (IC-AI'2000): Special Track on Inductive Learning, (Las Vegas, Nevada), 2000.

C. Ling and C. Li, ``Data Mining for Direct Marketing Problems and Solutions,'' in Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98), (New York, NY), AAAI Press, 1998.

N. Chawla, K. Bowyer, L. Hall, and P. Kegelmeyer, ``SMOTE: Synthetic Minority Over-sampling TEchnique,'' in International Conference of Knowledge Based Computer Systems, pp. 46-57, National Center for Software Technology, Mumbai, India, Allied Press, 2000.

J. Quinlan, C4.5: Programs for Machine Learning.
San Mateo, CA: Morgan Kaufmann, 1992.

W. W. Cohen, ``Fast Effective Rule Induction,'' in Proc. 12th International Conference on Machine Learning, (Lake Tahoe, CA), pp. 115-123, Morgan Kaufmann, 1995.

C. Drummond and R. Holte, ``Explicitly Representing Expected Cost: An Alternative to ROC Representation,'' in Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, (Boston), pp. 198-207, ACM, 2000.

P. Turney, ``Cost Sensitive Bibliography,'' 1996.

F. Provost, T. Fawcett, and R. Kohavi, ``The Case Against Accuracy Estimation for Comparing Induction Algorithms,'' in Proceedings of the Fifteenth International Conference on Machine Learning, (Madison, WI), pp. 445-453, Morgan Kaufmann, 1998.

I. Tomek, ``Two Modifications of CNN,'' IEEE Transactions on Systems, Man and Cybernetics, vol. 6, pp. 769-772, 1976.

A. Solberg and R. Solberg, ``A Large-Scale Evaluation of Features for Automatic Detection of Oil Spills in ERS SAR Images,'' in International Geoscience and Remote Sensing Symposium, (Lincoln, NE), pp. 1484-1486, 1996.

E. DeRouin, J. Brown, L. Fausett, and M. Schneider, ``Neural Network Training on Unequally Represented Classes,'' in Intelligent Engineering Systems Through Artificial Neural Networks, (New York), pp. 135-141, ASME Press, 1991.

C. van Rijsbergen, D. Harper, and M. Porter, ``The Selection of Good Search Terms,'' Information Processing and Management, vol. 17, pp. 77-91, 1981.

T. M. Ha and H. Bunke, ``Off-line, Handwritten Numeral Recognition by Perturbation Method,'' Pattern Analysis and Machine Intelligence, vol. 19/5, pp. 535-539, 1997.

W. W. Cohen and Y. Singer, ``Context-sensitive Learning Methods for Text Categorization,'' in Proceedings of SIGIR-96, 19th ACM International Conference on Research and Development in Information Retrieval (H.-P. Frei, D. Harman, P. Schäuble, and R. Wilkinson, eds.), (Zürich, CH), pp. 307-315, ACM Press, New York, US, 1996.

C. Blake and C. Merz, ``UCI Repository of Machine Learning Databases ~mlearn/~MLRepository.html.'' Department of Information and Computer Sciences, University of California, Irvine, 1998.

L. Hall, B. Mohney, and L. Kier, ``The Electrotopological State: Structure Information at the Atomic Level for Molecular Graphs,'' Journal of Chemical Information and Computer Sciences, vol. 31, no. 76, 1991.

N. Chawla and L. Hall, ``Modifying MUSTAFA to capture salient data,'' Tech. Rep. ISL-99-01, University of South Florida, Computer Science and Eng. Dept., 1999.

J. O'Rourke, Computational Geometry in C.
UK: Cambridge University Press, 1998.

C. Stanfill and D. Waltz, ``Toward Memory-based Reasoning,'' Communications of the ACM, vol. 29, no. 12, pp. 1213-1228, 1986.

S. Cost and S. Salzberg, ``A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features,'' Machine Learning, vol. 10, no. 1, pp. 57-78, 1993.

Nitesh Chawla (CS)