Challenges in the Practical Application of Machine Learning

Prof. Carla E. Brodley
Department of Computer Science
Tufts University

In this talk I will discuss the factors that impact the successful application of supervised machine learning. Driven by several interdisciplinary collaborations, we are addressing the problem of what to do when your initial accuracy is lower than is acceptable to your domain experts. Low accuracy can be due to three factors: noise in the class labels, insufficient training data, and features that are unable to discriminate the classes. I will discuss research efforts at Tufts addressing the latter two factors. The first project introduces a new problem, which we have named active class selection (ACS). ACS arises when one can ask the question: given the ability to collect n additional training instances, how should they be distributed with respect to class? The second project examines how one might determine that the class distinctions are not supported by the features, and how constraint-based clustering can be used to uncover the true class structure of the data.

These two issues and their solutions will be explored in the context of three applications. The first is to create a global map of the land cover of the Earth's surface from remotely sensed (satellite) data. The second is to build a classifier based on data collected from an "artificial nose" to discriminate vapors. The "nose" is a collection of sensors that have different reactions to different vapors. The third is to classify high-resolution computed tomography (HRCT) images of the lung.

Bio: Carla E. Brodley is a professor in the Department of Computer Science at Tufts University. She received her PhD in computer science from the University of Massachusetts at Amherst in 1994. From 1994 to 2004, she was on the faculty of the School of Electrical Engineering at Purdue University.
Professor Brodley's research interests include machine learning, knowledge discovery in databases, and computer security. She has worked in the areas of anomaly detection, active learning, classifier formation, unsupervised learning, and applications of machine learning to remote sensing, computer security, digital libraries, astrophysics, content-based image retrieval of medical images, computational biology, saliva diagnostics, evidence-based medicine, and chemistry. She was a member of the DSSG in 2004-2005. In 2001, she served as program co-chair for the International Conference on Machine Learning (ICML), and in 2004, she served as the general chair for ICML. Currently, she is an associate editor of JMLR and Machine Learning, and she is on the editorial board of DKMD. She is a member of the AAAI Council and is co-chair of the Computing Research Association's Committee on the Status of Women in Computing Research (CRA-W).