Machine Learning Papers and Abstracts

  1. Inductive Logic Programming for Natural Language Processing
    Raymond J. Mooney
    Proceedings of the 6th International Inductive Logic Programming Workshop, pp. 205-224, Stockholm, Sweden, August 1996.

    This paper reviews our recent work on applying inductive logic programming to the construction of natural language processing systems. We have developed a system, CHILL, that learns a parser from a training corpus of parsed sentences by inducing heuristics that control an initial overly-general shift-reduce parser. CHILL learns syntactic parsers as well as ones that translate English database queries directly into executable logical form. The ATIS corpus of airline information queries was used to test the acquisition of syntactic parsers, and CHILL performed competitively with recent statistical methods. English queries to a small database on U.S. geography were used to test the acquisition of a complete natural language interface, and the parser that CHILL acquired was more accurate than an existing hand-coded system. The paper also includes a discussion of several issues this work has raised regarding the capabilities and testing of ILP systems as well as a summary of our current research directions.
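
    The control-rule idea can be sketched in miniature: an overly-general shift-reduce parser consults a learned heuristic before committing to a reduction. The toy grammar, the hand-written heuristic, and the tag sequence below are invented stand-ins; CHILL itself induces such heuristics as relational control rules with ILP.

        # A minimal sketch (not CHILL itself): a control heuristic
        # specializes an overly-general shift-reduce parser. The grammar
        # and the control rule are hypothetical.
        GRAMMAR = {("Det", "N"): "NP", ("NP", "VP"): "S", ("V", "NP"): "VP"}

        def control_allows_reduce(stack, buffer):
            # Stand-in for an induced rule: delay reducing V+NP until the
            # input buffer is exhausted, so the object NP is complete.
            return not (stack[-2:] == ["V", "NP"] and buffer)

        def parse(tags):
            stack, buffer = [], list(tags)
            while buffer or len(stack) > 1:
                top2 = tuple(stack[-2:])
                if top2 in GRAMMAR and control_allows_reduce(stack, buffer):
                    stack[-2:] = [GRAMMAR[top2]]      # reduce
                elif buffer:
                    stack.append(buffer.pop(0))       # shift
                else:
                    return None                       # dead end
            return stack[0]

        print(parse(["Det", "N", "V", "Det", "N"]))   # -> S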

  2. Combining Symbolic and Connectionist Learning Methods to Refine Certainty-Factor Rule-Bases
    J. Jeffrey Mahoney
    Ph.D. Thesis, Department of Computer Sciences, University of Texas at Austin, May, 1996.

    This research describes the system RAPTURE, which is designed to revise rule bases expressed in certainty-factor format. Recent studies have shown that learning is facilitated when biased with domain-specific expertise, and have also shown that many real-world domains require some form of probabilistic or uncertain reasoning in order to successfully represent target concepts. RAPTURE was designed to take advantage of both of these results.

    Beginning with a set of certainty-factor rules, along with accurately-labelled training examples, RAPTURE makes use of both symbolic and connectionist learning techniques for revising the rules, in order that they correctly classify all of the training examples. A modified version of backpropagation is used to adjust the certainty factors of the rules, ID3's information-gain heuristic is used to add new rules, and the Upstart algorithm is used to create new hidden terms in the rule base.

    Results on refining four real-world rule bases are presented that demonstrate the effectiveness of this combined approach. Two of these rule bases were designed to identify particular areas in strands of DNA, one is for identifying infectious diseases, and the fourth attempts to diagnose soybean diseases. The results of RAPTURE are compared with those of backpropagation, C4.5, KBANN, and other learning systems. RAPTURE generally produces sets of rules that are more accurate than these other systems, often creating smaller sets of rules and using less training time.

  3. Integrating EBL and ILP to Acquire Control Rules for Planning
    Tara A. Estlin and Raymond J. Mooney
    Proceedings of the Third International Workshop on Multi-Strategy Learning, pp. 271-279, Harpers Ferry, WV, May 1996. (MSL-96).

    Most approaches to learning control information in planning systems use explanation-based learning to generate control rules. Unfortunately, EBL alone often produces overly complex rules that actually decrease planning efficiency. This paper presents a novel learning approach for control knowledge acquisition that integrates explanation-based learning with techniques from inductive logic programming. EBL is used to constrain an inductive search for selection heuristics that help a planner choose between competing plan refinements. SCOPE is one of the few systems to address learning control information in the newer partial-order planners. Specifically, SCOPE learns domain-specific control rules for a version of the UCPOP planning algorithm. The resulting system is shown to produce significant speedup in two different planning domains.

  4. Comparative Experiments on Disambiguating Word Senses: An Illustration of the Role of Bias in Machine Learning
    Raymond J. Mooney
    Proceedings of the 1996 Conference on Empirical Methods in Natural Language Processing, pp. 82-91, Philadelphia, PA, May 1996.

    This paper describes an experimental comparison of seven different learning algorithms on the problem of learning to disambiguate the meaning of a word from context. The algorithms tested include statistical, neural-network, decision-tree, rule-based, and case-based classification techniques. The specific problem tested involves disambiguating six senses of the word ``line'' using the words in the current and preceding sentence as context. The statistical and neural-network methods perform the best on this particular problem, and we discuss a potential reason for this observed difference. We also discuss the role of bias in machine learning and its importance in explaining performance differences observed on specific problems.
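
    As a concrete illustration of the simplest end of this spectrum, here is a minimal naive-Bayes context classifier of the general kind such comparisons include; the three-example "line" corpus and sense labels are invented for illustration, not the paper's data.

        # Naive Bayes over context words with add-one smoothing (a sketch,
        # not the paper's exact statistical method; tiny invented corpus).
        import math
        from collections import Counter, defaultdict

        def train(examples):                 # examples: (context words, sense)
            priors, cond = Counter(), defaultdict(Counter)
            for words, sense in examples:
                priors[sense] += 1
                cond[sense].update(words)
            return priors, cond

        def classify(words, priors, cond, vocab_size):
            def score(sense):
                total = sum(cond[sense].values())
                s = math.log(priors[sense])
                for w in words:
                    s += math.log((cond[sense][w] + 1) / (total + vocab_size))
                return s
            return max(priors, key=score)

        data = [(["telephone", "dead"], "phone-line"),
                (["drew", "straight"], "line-segment"),
                (["call", "busy"], "phone-line")]
        priors, cond = train(data)
        vocab = {w for ws, _ in data for w in ws}
        print(classify(["busy", "telephone"], priors, cond, len(vocab)))  # phone-line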

  5. Learning to Parse Database Queries using Inductive Logic Programming
    John M. Zelle and Raymond J. Mooney
    Proceedings of the Thirteenth National Conference on Artificial Intelligence, pp. 1050-1055, Portland, OR, August, 1996. (AAAI-96)

    This paper presents recent work using the CHILL parser acquisition system to automate the construction of a natural-language interface for database queries. CHILL treats parser acquisition as the learning of search-control rules within a logic program representing a shift-reduce parser and uses techniques from Inductive Logic Programming to learn relational control knowledge. Starting with a general framework for constructing a suitable logical form, CHILL is able to train on a corpus comprising sentences paired with database queries and induce parsers that map subsequent sentences directly into executable queries. Experimental results with a complete database-query application for U.S. geography show that CHILL is able to learn parsers that outperform a pre-existing, hand-crafted counterpart. These results demonstrate the ability of a corpus-based system to produce more than purely syntactic representations. They also provide direct evidence of the utility of an empirical approach at the level of a complete natural language application.

  6. A Novel Application of Theory Refinement to Student Modeling
    Paul Baffes and Raymond J. Mooney
    Proceedings of the Thirteenth National Conference on Artificial Intelligence, pp. 403-408, Portland, OR, August, 1996. (AAAI-96)

    Theory refinement systems developed in machine learning automatically modify a knowledge base to render it consistent with a set of classified training examples. We illustrate a novel application of these techniques to the problem of constructing a student model for an intelligent tutoring system (ITS). Our approach is implemented in an ITS authoring system called ASSERT, which uses theory refinement to introduce errors into an initially correct knowledge base so that it models incorrect student behavior. The efficacy of the approach has been demonstrated by evaluating a tutor developed with ASSERT with 75 students tested on a classification task covering concepts from an introductory course on the C++ programming language. The system produced reasonably accurate models, and students who received feedback based on these models performed significantly better on a post test than students who received simple reteaching.

  7. Qualitative Multiple-Fault Diagnosis of Continuous Dynamic Systems Using Behavioral Modes
    Siddarth Subramanian and Raymond J. Mooney
    Proceedings of the Thirteenth National Conference on Artificial Intelligence, pp. 965-970, Portland, OR, August, 1996. (AAAI-96)

    Most model-based diagnosis systems, such as GDE and Sherlock, have concerned discrete, static systems such as logic circuits and use simple constraint propagation to detect inconsistencies. However, sophisticated systems such as QSIM and QPE have been developed for qualitative modeling and simulation of continuous dynamic systems. We present an integration of these two lines of research as implemented in a system called QDOCS for multiple-fault diagnosis of continuous dynamic systems using QSIM models. The main contributions of the algorithm include a method for propagating dependencies while solving a general constraint satisfaction problem and a method for verifying the consistency of a behavior with a model across time. Through systematic experiments on two realistic engineering systems, we demonstrate that QDOCS achieves the best balance of generality, accuracy, and efficiency among competing methods.

  8. Multi-Strategy Learning of Search Control for Partial-Order Planning
    Tara A. Estlin and Raymond J. Mooney
    Proceedings of the Thirteenth National Conference on Artificial Intelligence, pp. 843-848, Portland, OR, August, 1996. (AAAI-96)

    Most research in planning and learning has involved linear, state-based planners. This paper presents SCOPE, a system for learning search-control rules that improve the performance of a partial-order planner. SCOPE integrates explanation-based and inductive learning techniques to acquire control rules for a partial-order planner. Learned rules are in the form of selection heuristics that help the planner choose between competing plan refinements. Specifically, SCOPE learns domain-specific control rules for a version of the UCPOP planning algorithm. The resulting system is shown to produce significant speedup in two different planning domains.

  9. Integrating Explanation-Based and Inductive Learning Techniques to Acquire Search-Control for Planning
    Tara A. Estlin
    Ph.D. proposal, Department of Computer Sciences, University of Texas at Austin, 1996.

    Planning systems have become an important tool for automating a wide variety of tasks. Control knowledge guides a planner to find solutions quickly and is crucial for efficient planning in most domains. Machine learning techniques enable a planning system to automatically acquire domain-specific search-control knowledge for different applications. Past approaches to learning control information have usually employed explanation-based learning (EBL) to generate control rules. Unfortunately, EBL alone often produces overly complex rules that actually decrease rather than improve overall planning efficiency. This paper presents a novel learning approach for control knowledge acquisition that integrates explanation-based learning with techniques from inductive logic programming. In our learning system SCOPE, EBL is used to constrain an inductive search for control heuristics that help a planner choose between competing plan refinements. SCOPE is one of the few systems to address learning control information for newer, partial-order planners. Specifically, this proposal describes how SCOPE learns domain-specific control rules for the UCPOP planning algorithm. The resulting system is shown to produce significant speedup in two different planning domains, and to be more effective than a pure EBL approach.

    Future research will be performed in three main areas. First, SCOPE's learning algorithm will be extended to include additional techniques such as constructive induction and rule utility analysis. Second, SCOPE will be more thoroughly tested; several real-world planning domains have been identified as possible testbeds, and more in-depth comparisons will be drawn between SCOPE and other competing approaches. Third, SCOPE will be implemented in a different planning system in order to test its portability to other planning algorithms. This work should demonstrate that machine-learning techniques can be a powerful tool in the quest for tractable real-world planning.

  10. Lexical Acquisition: A Novel Machine Learning Problem
    Cynthia A. Thompson and Raymond J. Mooney
    Technical Report, Artificial Intelligence Lab, University of Texas at Austin, 1996.

    This paper defines a new machine learning problem to which standard machine learning algorithms cannot easily be applied. The problem occurs in the domain of lexical acquisition. The ambiguity and synonymy of words make it difficult to use standard induction techniques to learn a lexicon. Additionally, negative examples are typically unavailable or difficult to construct in this domain. One approach to solving the lexical acquisition problem is presented, along with preliminary experimental results on an artificial corpus. Future work includes extending the algorithm and performing tests on a more realistic corpus.

  11. Advantages of Decision Lists and Implicit Negatives in Inductive Logic Programming
    Mary Elaine Califf and Raymond J. Mooney
    Technical Report, Artificial Intelligence Lab, University of Texas at Austin, 1996.

    This paper demonstrates the capabilities of FOIDL, an inductive logic programming (ILP) system whose distinguishing characteristics are the ability to produce first-order decision lists, the use of an output completeness assumption to provide implicit negative examples, and the use of intensional background knowledge. The development of FOIDL was originally motivated by the problem of learning to generate the past tense of English verbs; however, this paper demonstrates its superior performance on two different sets of benchmark ILP problems. Tests on the finite element mesh design problem show that FOIDL's decision lists enable it to produce better results than all other ILP systems whose results on this problem have been reported. Tests with a selection of list-processing problems from Bratko's introductory Prolog text demonstrate that the combination of implicit negatives and intensionality allows FOIDL to learn correct programs from far fewer examples than FOIL.

  12. Learning Parse Decisions From Examples With Rich Context
    Ulf Hermjakob and Raymond J. Mooney
    Submitted to the 34th Annual Meeting of the Association for Computational Linguistics (ACL-96).

    We present a knowledge- and context-based system for parsing natural language and evaluate it on sentences from the Wall Street Journal. Applying machine learning techniques, the system uses parse action examples acquired under supervision to generate a deterministic shift-reduce parser in the form of a decision structure. It relies heavily on context, as encoded in features which describe the morphological, syntactic, semantic, and other aspects of a given parse state.

  13. Corpus-Based Lexical Acquisition For Semantic Parsing
    Cynthia Thompson
    Ph.D. proposal, Department of Computer Sciences, University of Texas at Austin, 1995.

    Building accurate and efficient natural language processing (NLP) systems is an important and difficult problem. There has been increasing interest in automating this process. The lexicon, or the mapping from words to meanings, is one component that is typically difficult to update and that changes from one domain to the next. Therefore, automating the acquisition of the lexicon is an important task in automating the acquisition of NLP systems. This proposal describes a system, WOLFIE (WOrd Learning From Interpreted Examples), that learns a lexicon from input consisting of sentences paired with representations of their meanings. Preliminary experimental results show that this system can learn correct and useful mappings. The correctness is evaluated by comparing a known lexicon to one learned from the training input. The usefulness is evaluated by examining the effect of using the lexicon learned by WOLFIE to assist a parser acquisition system, where previously this lexicon had to be hand-built. Future work in the form of extensions to the algorithm, further evaluation, and possible applications is discussed.

  14. Refinement of Bayesian Networks by Combining Connectionist and Symbolic Techniques
    Sowmya Ramachandran
    Ph.D. proposal, Department of Computer Sciences, University of Texas at Austin, 1995.

    Bayesian networks provide a mathematically sound formalism for representing and reasoning with uncertain knowledge and are as such widely used. However, acquiring and capturing knowledge in this framework is difficult. There is a growing interest in formulating techniques for learning Bayesian networks inductively. While the problem of learning a Bayesian network, given complete data, has been explored in some depth, the problem of learning networks with unobserved causes is still open. In this proposal, we view this problem from the perspective of theory revision and present a novel approach which adapts techniques developed for revising theories in symbolic and connectionist representations. Thus, we assume that the learner is given an initial approximate network (usually obtained from an expert). Our technique inductively revises the network to fit the data better. Our proposed system has two components: one component revises the parameters of a Bayesian network of known structure, and the other component revises the structure of the network. The component for parameter revision maps the given Bayesian network into a multi-layer feedforward neural network, with the parameters mapped to weights in the neural network, and uses standard backpropagation techniques to learn the weights. The structure revision component uses qualitative analysis to suggest revisions to the network when it fails to predict the data accurately. The first component has been implemented and we will present results from experiments on real world classification problems which show our technique to be effective. We will also discuss our proposed structure revision algorithm, our plans for experiments to evaluate the system, as well as some extensions to the system.

  15. Refinement-Based Student Modeling and Automated Bug Library Construction
    Paul Baffes and Raymond Mooney
    Journal of Artificial Intelligence in Education, 7, 1 (1996), pp. 75-116.

    A critical component of model-based intelligent tutoring systems is a mechanism for capturing the conceptual state of the student, which enables the system to tailor its feedback to suit individual strengths and weaknesses. To be useful such a modeling technique must be practical, in the sense that models are easy to construct, and effective, in the sense that using the model actually impacts student learning. This research presents a new student modeling technique which can automatically capture novel student errors using only correct domain knowledge, and can automatically compile trends across multiple student models. This approach has been implemented as a computer program, ASSERT, using a machine learning technique called theory refinement, which is a method for automatically revising a knowledge base to be consistent with a set of examples. Using a knowledge base that correctly defines a domain and examples of a student's behavior in that domain, ASSERT models student errors by collecting any refinements to the correct knowledge base which are necessary to account for the student's behavior. The efficacy of the approach has been demonstrated by evaluating ASSERT using 100 students tested on a classification task covering concepts from an introductory course on the C++ programming language. Students who received feedback based on the models automatically generated by ASSERT performed significantly better on a post test than students who received simple reteaching.

  16. Comparative Results on Using Inductive Logic Programming for Corpus-based Parser Construction
    John M. Zelle and Raymond J. Mooney
    Symbolic, Connectionist, and Statistical Approaches to Learning for Natural Language Processing, S. Wermter, E. Riloff and G. Scheler, Eds., Springer-Verlag, 1996.

    This paper presents results from recent experiments with CHILL, a corpus-based parser acquisition system. CHILL treats language acquisition as the learning of search-control rules within a logic program. Unlike many current corpus-based approaches that use statistical learning algorithms, CHILL uses techniques from inductive logic programming (ILP) to learn relational representations. CHILL is a very flexible system and has been used to learn parsers that produce syntactic parse trees, case-role analyses, and executable database queries. The reported experiments compare CHILL's performance to that of a more naive application of ILP to parser acquisition. The results show that ILP techniques, as employed in CHILL, are a viable alternative to statistical methods and that the control-rule framework is fundamental to CHILL's success.

  17. Learning the Past Tense of English Verbs Using Inductive Logic Programming
    Raymond J. Mooney and Mary Elaine Califf
    Symbolic, Connectionist, and Statistical Approaches to Learning for Natural Language Processing, S. Wermter, E. Riloff and G. Scheler, Eds., Springer-Verlag, 1996.

    This paper presents results on using a new inductive logic programming method called FOIDL to learn the past tense of English verbs. The past tense task has been widely studied in the context of the symbolic/connectionist debate. Previous papers have presented results using various neural-network and decision-tree learning methods. We have developed a technique for learning a special type of Prolog program called a first-order decision list, defined as an ordered list of clauses each ending in a cut. FOIDL is based on FOIL (Quinlan, 1990) but employs intensional background knowledge and avoids the need for explicit negative examples. It is particularly useful for problems that involve rules with specific exceptions, such as the past-tense task. We present results showing that FOIDL learns a more accurate past-tense generator from significantly fewer examples than all other previous methods.

  18. Hybrid Learning of Search Control for Partial-Order Planning
    Tara A. Estlin and Raymond J. Mooney
    New Directions in AI Planning, M. Ghallab and A. Milani, Eds, IOS Press, 1996, pp. 129-140.

    This paper presents results on applying a version of the DOLPHIN search-control learning system to speed up a partial-order planner. DOLPHIN integrates explanation-based and inductive learning techniques to acquire effective clause-selection rules for Prolog programs. A version of the UCPOP partial-order planning algorithm has been implemented as a Prolog program and DOLPHIN used to automatically learn domain-specific search control rules that help eliminate backtracking. The resulting system is shown to produce significant speedup in several planning domains.

  19. Revising Bayesian Network Parameters Using Connectionist Methods
    Sowmya Ramachandran and Raymond J. Mooney
    Proceedings of the International Conference on Neural Networks (ICNN-96), Special Session on Knowledge-Based Artificial Neural Networks, Washington DC, June 1996.

    The problem of learning Bayesian networks with hidden variables is known to be a hard problem. Even the simpler task of learning just the conditional probabilities on a Bayesian network with hidden variables is hard. In this paper, we present an approach that learns the conditional probabilities on a Bayesian network with hidden variables by transforming it into a multi-layer feedforward neural network (ANN). The conditional probabilities are mapped onto weights in the ANN, which are then learned using standard backpropagation techniques. To avoid the problem of exponentially large ANNs, we focus on Bayesian networks with noisy-or and noisy-and nodes. Experiments on real world classification problems demonstrate the effectiveness of our technique.
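
    The core mapping can be illustrated on a single noisy-or node: its inhibition probabilities act as weights through which a cross-entropy gradient can flow. The two-cause data set below is invented, and the plain gradient loop is a simplification of the paper's backpropagation through a full network.

        # Gradient learning of one noisy-or node's parameters (a sketch).
        import random

        def noisy_or(q, x):                  # P(effect=1 | causes x)
            p_off = 1.0
            for qi, xi in zip(q, x):
                if xi:
                    p_off *= (1.0 - qi)
            return 1.0 - p_off

        def train(data, n, lr=0.5, epochs=2000):
            q = [random.uniform(0.1, 0.9) for _ in range(n)]
            for _ in range(epochs):
                for x, y in data:
                    p = noisy_or(q, x)
                    for i in range(n):       # chain rule on -log likelihood
                        if x[i] and 0.0 < p < 1.0:
                            dp_dqi = (1.0 - p) / (1.0 - q[i])
                            grad = (p - y) / (p * (1.0 - p)) * dp_dqi
                            q[i] = min(0.99, max(0.01, q[i] - lr * grad))
            return q

        data = [([1, 0], 1), ([0, 1], 0), ([1, 1], 1), ([0, 0], 0)]
        print([round(qi, 2) for qi in train(data, 2)])  # q[0] high, q[1] low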

  20. Qualitative Multiple-Fault Diagnosis of Continuous Dynamic Systems Using Behavioral Modes
    Siddarth Subramanian
    Ph.D. Thesis, Department of Computer Sciences, University of Texas at Austin, August, 1995.

    As systems like chemical plants, power plants, and automobiles get more complex, online diagnostic systems are becoming increasingly important. One of the ways to rein in the complexity of describing and reasoning about large systems such as these is to describe them using qualitative rather than quantitative models.

    Model-based diagnosis is a class of diagnostic techniques that use direct knowledge about how a system functions instead of expert rules detailing causes for every possible set of symptoms of a broken system. Our research builds on standard methods for model-based diagnosis and extends them to the domain of complex dynamic systems described using qualitative models.

    We motivate and describe our algorithm for diagnosing faults in a dynamic system given a qualitative model and a sequence of qualitative states. The main contributions in this algorithm include a method for propagating dependencies while solving a general constraint satisfaction problem, and a method for verifying the compatibility of a behavior with a model across time. The algorithm can diagnose multiple faults and uses models of faulty behavior, or behavioral modes.

    We then demonstrate these techniques using an implemented program called QDOCS and test it on some realistic problems. Through our experiments with a model of the reaction control system (RCS) of the space shuttle and with a level-controller for a reaction tank, we show that QDOCS demonstrates the best balance of generality, accuracy and efficiency among known systems.

  21. Using Inductive Logic Programming to Automate the Construction of Natural Language Parsers
    John M. Zelle
    Ph.D. Thesis, Department of Computer Sciences, University of Texas at Austin, August, 1995.

    Designing computer systems to understand natural language input is a difficult task. In recent years there has been considerable interest in corpus-based methods for constructing natural language parsers. These empirical approaches replace hand-crafted grammars with linguistic models acquired through automated training over language corpora. A common thread among such methods to date is the use of propositional or probabilistic representations for the learned knowledge. This dissertation presents an alternative approach based on techniques from a subfield of machine learning known as inductive logic programming (ILP). ILP, which investigates the learning of relational (first-order) rules, provides an empirical method for acquiring knowledge within traditional, symbolic parsing frameworks.

    This dissertation details the architecture, implementation and evaluation of CHILL, a computer system for acquiring natural language parsers by training over corpora of parsed text. CHILL treats language acquisition as the learning of search-control rules within a logic program that implements a shift-reduce parser. Control rules are induced using a novel ILP algorithm which handles difficult issues arising in the induction of search-control heuristics. Both the control-rule framework and the induction algorithm are crucial to CHILL's success.

    The main advantage of CHILL over propositional counterparts is its flexibility in handling varied representations. CHILL has produced parsers for various analyses including case-role mapping, detailed syntactic parse trees, and a logical form suitable for expressing first-order database queries. All of these tasks are accomplished within the same framework, using a single, general learning method that can acquire new syntactic and semantic categories for resolving ambiguities.

    Experimental evidence from both artificial and real-world corpora demonstrates that CHILL learns parsers as well as or better than previous artificial neural network or probabilistic approaches on comparable tasks. In the database query domain, which goes beyond the scope of previous empirical approaches, the learned parser outperforms an existing hand-crafted system. These results support the claim that ILP techniques as implemented in CHILL represent a viable alternative with significant potential advantages over neural-network, propositional, and probabilistic approaches to empirical parser construction.

  22. An Inductive Logic Programming Method for Corpus-based Parser Construction
    John M. Zelle and Raymond J. Mooney
    Submitted to Computational Linguistics.

    In recent years there has been considerable research into corpus-based methods for parser construction. A common thread in this research has been the use of propositional representations for learned knowledge. This paper presents an alternative approach based on techniques from a subfield of machine learning known as inductive logic programming (ILP). ILP, which investigates the learning of relational (first-order) rules, provides a way of using empirical methods to acquire knowledge within traditional, symbolic parsing frameworks. We describe a novel method for constructing deterministic Prolog parsers from corpora of parsed sentences. We also discuss several advantages of this approach compared to propositional alternatives and present experimental results on learning complete parsers using several corpora including the ATIS corpus from the Penn Treebank.

  23. A Comparison of Two Methods Employing Inductive Logic Programming for Corpus-based Parser Construction
    John M. Zelle and Raymond J. Mooney
    Working Notes of the IJCAI-95 Workshop on New Approaches to Learning for Natural Language Processing, pp. 79-86, Montreal, Quebec, August, 1995.

    This paper presents results from recent experiments with CHILL, a corpus-based parser acquisition system. CHILL treats grammar acquisition as the learning of search-control rules within a logic program. Unlike many current corpus-based approaches that use propositional or probabilistic learning algorithms, CHILL uses techniques from inductive logic programming (ILP) to learn relational representations. The reported experiments compare CHILL's performance to that of a more naive application of ILP to parser acquisition. The results show that ILP techniques, as employed in CHILL, are a viable alternative to propositional methods and that the control-rule framework is fundamental to CHILL's success.

  24. Inducing Logic Programs without Explicit Negative Examples
    John M. Zelle, Cynthia A. Thompson, Mary Elaine Califf, and Raymond J. Mooney
    Proceedings of the Fifth International Workshop on Inductive Logic Programming, Leuven, Belgium, September 1995.

    This paper presents a method for learning logic programs without explicit negative examples by exploiting an assumption of output completeness. A mode declaration is supplied for the target predicate and each training input is assumed to be accompanied by all of its legal outputs. Any other outputs generated by an incomplete program implicitly represent negative examples; however, large numbers of ground negative examples never need to be generated. This method has been incorporated into two ILP systems, CHILLIN and IFOIL, both of which use intensional background knowledge. Tests on two natural language acquisition tasks, case-role mapping and past-tense learning, illustrate the advantages of the approach.
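
    The mechanics of the assumption are easy to state operationally: run the current program on a training input and treat every derived output that is not among the listed legal outputs as a negative. A toy version, with an invented overly-general past-tense "program":

        # Implicit negatives from output completeness (a sketch).
        def implicit_negatives(program, training):
            # program: input -> set of outputs the current clauses derive
            # training: input -> set of ALL legal outputs (assumed complete)
            negatives = set()
            for x, legal in training.items():
                for y in program(x):
                    if y not in legal:
                        negatives.add((x, y))    # covered, but not legal
            return negatives

        overgeneral = lambda verb: {verb + "ed", verb}   # invented program
        training = {"walk": {"walked"}, "hit": {"hit"}}
        print(implicit_negatives(overgeneral, training))
        # -> {('walk', 'walk'), ('hit', 'hited')} (set order may vary)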

  25. Acquisition of a Lexicon from Semantic Representations of Sentences
    Cynthia A. Thompson
    33rd Annual Meeting of the Association for Computational Linguistics, pp. 335-337, Boston, MA, July 1995. (ACL-95)

    A system, WOLFIE, that acquires a mapping of words to their semantic representation is presented and a preliminary evaluation is performed. Tree least general generalizations (TLGGs) of the representations of input sentences are computed to assist in determining the representations of individual words in the sentences. The best guess for the meaning of a word is the TLGG which overlaps with the highest percentage of sentence representations in which that word appears. Some promising experimental results on a non-artificial data set are presented.
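
    A TLGG itself is just anti-unification of two trees: keep matching structure and replace mismatches with variables. A minimal sketch on invented meaning representations (a full LGG would also reuse one variable per repeated mismatched pair, which is omitted here for brevity):

        # Tree least general generalization by anti-unification (a sketch).
        def tlgg(t1, t2, fresh=iter("XYZUVW")):
            # Terms are tuples: (functor, arg1, arg2, ...).
            if t1 == t2:
                return t1
            if (isinstance(t1, tuple) and isinstance(t2, tuple)
                    and t1[0] == t2[0] and len(t1) == len(t2)):
                return (t1[0],) + tuple(tlgg(a, b, fresh)
                                        for a, b in zip(t1[1:], t2[1:]))
            return next(fresh)               # mismatch: new variable

        a = ("ate", ("person", "mary"), ("food", "pasta"))
        b = ("ate", ("person", "john"), ("food", "rice"))
        print(tlgg(a, b))   # -> ('ate', ('person', 'X'), ('food', 'Y'))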

  26. Induction of First-Order Decision Lists: Results on Learning the Past Tense of English Verbs
    Raymond J. Mooney and Mary Elaine Califf
    Journal of Artificial Intelligence Research, 3 (1995), pp. 1-24.

    This paper presents a method for inducing logic programs from examples that learns a new class of concepts called first-order decision lists, defined as ordered lists of clauses each ending in a cut. The method, called FOIDL, is based on FOIL but employs intensional background knowledge and avoids the need for explicit negative examples. It is particularly useful for problems that involve rules with specific exceptions, such as learning the past-tense of English verbs, a task widely studied in the context of the symbolic/connectionist debate. FOIDL is able to learn concise, accurate programs for this problem from significantly fewer examples than previous methods (both connectionist and symbolic).
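
    The learned programs behave like ordered rules with a first-match-wins discipline, which is what the cut enforces: exceptions are listed before the general default. A toy rendering for the past-tense task (the rules are hand-written here, where FOIDL induces them):

        # First-order decision list behaviour, sketched propositionally.
        RULES = [
            (lambda v: v == "eat",      lambda v: "ate"),    # exception
            (lambda v: v.endswith("e"), lambda v: v + "d"),
            (lambda v: True,            lambda v: v + "ed"), # default
        ]

        def past_tense(verb):
            for test, act in RULES:
                if test(verb):
                    return act(verb)    # commit, like Prolog's cut

        print([past_tense(v) for v in ["eat", "bake", "walk"]])
        # -> ['ate', 'baked', 'walked']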

  27. Multiple-Fault Diagnosis Using General Qualitative Models with Fault Modes
    Siddarth Subramanian and Raymond J. Mooney
    Working Notes of the IJCAI-95 Workshop on Engineering Problems for Qualitative Reasoning, Montreal, Quebec, August 1995.

    This paper describes an approach to diagnosis of systems described by qualitative differential equations represented as QSIM models. An implemented system QDOCS is described that performs multiple-fault, fault-model based diagnosis, using constraint satisfaction techniques, of qualitative behaviors of systems described by such models. We demonstrate the utility of this system by accurately diagnosing randomly generated faults using simulated behaviors of a portion of the Reaction Control System of the space shuttle.

  28. Learning Qualitative Models for Systems with Multiple Operating Regions
    Sowmya Ramachandran, Raymond J. Mooney and Benjamin J. Kuipers
    Proceedings of the Eighth International Workshop on Qualitative Reasoning about Physical Systems, pp. 212-223, Nara, Japan, June 1994. (QR-94)

    The problem of learning qualitative models of physical systems from observations of their behaviour has been addressed by several researchers in recent years. Most current techniques limit themselves to learning a single qualitative differential equation to model the entire system. However, many systems have several qualitative differential equations underlying them. In this paper, we present an approach to learning the models for such systems. Our technique divides the behaviours into segments, each of which can be explained by a single qualitative differential equation. The qualitative model for each segment can be generated using any of the existing techniques for learning a single model. We show the results of applying our technique to several examples and demonstrate that it is effective.

  29. Multiple-Fault Diagnosis Using General Qualitative Models with Fault Modes
    Siddarth Subramanian and Raymond J. Mooney
    Working Papers of the Fifth International Workshop on Principles of Diagnosis, pp. 321-325, New Paltz, NY, 1994.

    This paper describes an approach to diagnosis of systems described by qualitative differential equations represented as QSIM models. An implemented system QDOCS is described that performs multiple-fault, fault-model based diagnosis, using constraint satisfaction techniques, of qualitative behaviors of systems described by such models. We demonstrate the utility of this system by accurately diagnosing randomly generated faults using simulated behaviors of a portion of the Reaction Control System of the space shuttle.

  30. Automatic Student Modeling and Bug Library Construction using Theory Refinement
    Paul T. Baffes
    Ph.D. Thesis, Department of Computer Sciences, University of Texas at Austin, December, 1994.

    The history of computers in education can be characterized by a continuing effort to construct intelligent tutorial programs which can adapt to the individual needs of a student in a one-on-one setting. A critical component of these intelligent tutorials is a mechanism for modeling the conceptual state of the student so that the system is able to tailor its feedback to suit individual strengths and weaknesses. The primary contribution of this research is a new student modeling technique which can automatically capture novel student errors using only correct domain knowledge, and can automatically compile trends across multiple student models into bug libraries. This approach has been implemented as a computer program, ASSERT, using a machine learning technique called theory refinement, which is a method for automatically revising a knowledge base to be consistent with a set of examples. Using a knowledge base that correctly defines a domain and examples of a student's behavior in that domain, ASSERT models student errors by collecting any refinements to the correct knowledge base which are necessary to account for the student's behavior. The efficacy of the approach has been demonstrated by evaluating ASSERT using 100 students tested on a classification task using concepts from an introductory course on the C++ programming language. Students who received feedback based on the models automatically generated by ASSERT performed significantly better on a post test than students who received simple reteaching.

  31. Inductive Learning For Abductive Diagnosis
    Cynthia A. Thompson and Raymond J. Mooney
    Proceedings of the Twelfth National Conference on AI, pp. 664-669, Seattle, WA, July 1994. (AAAI-94)

    A new inductive learning system, LAB (Learning for ABduction), is presented which acquires abductive rules from a set of training examples. The goal is to find a small knowledge base which, when used abductively, diagnoses the training examples correctly and generalizes well to unseen examples. This contrasts with past systems that inductively learn rules that are used deductively. Each training example is associated with potentially multiple categories (disorders), instead of one as with typical learning systems. LAB uses a simple hill-climbing algorithm to efficiently build a rule base for a set-covering abductive system. LAB has been experimentally evaluated and compared to other learning systems and an expert knowledge base in the domain of diagnosing brain damage due to stroke.
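
    The abductive use of the learned rules amounts to set covering: choose a small set of disorders whose associated symptoms jointly explain the observations. A greedy sketch with an invented two-rule knowledge base (LAB's actual hill-climbing searches over the rule base itself):

        # Set-covering abduction with a greedy cover (a sketch).
        def abduce(rules, observed):
            # rules: disorder -> set of symptoms it explains
            diagnosis, uncovered = set(), set(observed)
            while uncovered:
                best = max(rules, key=lambda d: len(rules[d] & uncovered))
                if not rules[best] & uncovered:
                    break                    # some symptom unexplainable
                diagnosis.add(best)
                uncovered -= rules[best]
            return diagnosis

        rules = {"left_stroke": {"weak_right_arm", "aphasia"},
                 "right_stroke": {"weak_left_arm"}}
        print(abduce(rules, {"aphasia", "weak_left_arm"}))
        # -> {'left_stroke', 'right_stroke'}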

  32. Comparing Methods For Refining Certainty Factor Rule-Bases
    J. Jeffrey Mahoney and Raymond J. Mooney
    Proceedings of the Eleventh International Workshop on Machine Learning, pp. 173-180, Rutgers, NJ, July 1994. (ML-94)

    This paper compares two methods for refining uncertain knowledge bases using propositional certainty-factor rules. The first method, implemented in the RAPTURE system, employs neural-network training to refine the certainties of existing rules but uses a symbolic technique to add new rules. The second method, based on the one used in the KBANN system, initially adds a complete set of potential new rules with very low certainty and allows neural-network training to filter and adjust these rules. Experimental results indicate that the former method results in significantly faster training and produces much simpler refined rule bases with slightly greater accuracy.

  33. Modifying Network Architectures For Certainty-Factor Rule-Base Revision
    J. Jeffrey Mahoney and Raymond J. Mooney
    Proceedings of the International Symposium on Integrating Knowledge and Neural Heuristics 1994, pp. 75-84, Pensacola, FL, May 1994. (ISIKNH-94)

    This paper describes RAPTURE --- a system for revising probabilistic rule bases that converts symbolic rules into a connectionist network, which is then trained via connectionist techniques. It uses a modified version of backpropagation to refine the certainty factors of the rule base, and uses ID3's information-gain heuristic (Quinlan) to add new rules. Work is currently under way for finding improved techniques for modifying network architectures that include adding hidden units using the UPSTART algorithm (Frean). A case is made via comparison with fully connected connectionist techniques for keeping the rule base as close to the original as possible, adding new input units only as needed.

  34. Combining Top-Down And Bottom-Up Techniques In Inductive Logic Programming
    John M. Zelle, Raymond J. Mooney and Joshua B. Konvisser
    Proceedings of the Eleventh International Workshop on Machine Learning, pp. 343-351, Rutgers, NJ, July 1994. (ML-94)

    This paper describes a new method for inducing logic programs from examples which attempts to integrate the best aspects of existing ILP methods into a single coherent framework. In particular, it combines a bottom-up method similar to GOLEM with a top-down method similar to FOIL. It also includes a method for predicate invention similar to CHAMP and an elegant solution to the ``noisy oracle'' problem which allows the system to learn recursive programs without requiring a complete set of positive examples. Systematic experimental comparisons to both GOLEM and FOIL on a range of problems are used to clearly demonstrate the advantages of the approach.
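
    The top-down half of such a combination is typically guided by FOIL's information-gain heuristic, which scores a candidate literal by how many bits it saves on the positives it keeps. A sketch of that score (the notation follows Quinlan's FOIL; the numbers are invented):

        # FOIL information gain for one candidate literal (a sketch).
        import math

        def foil_gain(t, p0, n0, p1, n1):
            # p0, n0: positive/negative tuples covered before the literal;
            # p1, n1: after; t: positives still covered after adding it.
            before = math.log2(p0 / (p0 + n0))
            after = math.log2(p1 / (p1 + n1))
            return t * (after - before)

        print(foil_gain(t=8, p0=10, n0=10, p1=8, n1=2))   # ~5.4 bits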

  35. Inducing Deterministic Prolog Parsers From Treebanks: A Machine Learning Approach
    John M. Zelle and Raymond J. Mooney
    Proceedings of the Twelfth National Conference on AI, pp. 748-753, Seattle, WA, July 1994. (AAAI-94)

    This paper presents a method for constructing deterministic, context-sensitive, Prolog parsers from corpora of parsed sentences. Our approach uses recent machine learning methods for inducing Prolog rules from examples (inductive logic programming). We discuss several advantages of this method compared to recent statistical methods and present results on learning complete parsers from portions of the ATIS corpus.

  36. Integrating ILP and EBL
    Raymond J. Mooney and John M. Zelle
    SIGART Bulletin, Volume 5, number 1, Jan. 1994, pp 12-21.

    This paper presents a review of recent work that integrates methods from Inductive Logic Programming (ILP) and Explanation-Based Learning (EBL). ILP and EBL methods have complementary strengths and weaknesses and a number of recent projects have effectively combined them into systems with better performance than either of the individual approaches. In particular, integrated systems have been developed for guiding induction with prior knowledge (ML-SMART, FOCL, GRENDEL), refining imperfect domain theories (FORTE, AUDREY, Rx), and learning effective search-control knowledge (AxA-EBL, DOLPHIN).

  37. Extending Theory Refinement to M-of-N Rules
    Paul T. Baffes and Raymond J. Mooney
    Informatica, 17 (1993), pp. 387-397.

    In recent years, machine learning research has started addressing a problem known as theory refinement. The goal of a theory refinement learner is to modify an incomplete or incorrect rule base, representing a domain theory, to make it consistent with a set of input training examples. This paper presents a major revision of the EITHER propositional theory refinement system. Two issues are discussed. First, we show how run time efficiency can be greatly improved by changing from an exhaustive scheme for computing repairs to an iterative greedy method. Second, we show how to extend EITHER to refine M-of-N rules. The resulting algorithm, NEITHER (New EITHER), is more than an order of magnitude faster and produces significantly more accurate results with theories that fit the M-of-N format. To demonstrate the advantages of NEITHER, we present experimental results from two real-world domains.
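
    An M-of-N rule itself is simple to state: the consequent holds when at least M of its N antecedents do, a form that ordinary symbolic rules capture only awkwardly. A sketch with an invented promoter-style rule:

        # Evaluating an M-of-N rule (a sketch; the antecedents are invented).
        def m_of_n(m, antecedents, facts):
            return sum(a in facts for a in antecedents) >= m

        rule = (2, ["minus35_match", "minus10_match", "conformation_ok"])
        print(m_of_n(*rule, facts={"minus35_match", "conformation_ok"}))  # True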

  38. Inductive Learning for Abductive Diagnosis
    Cynthia Thompson
    M.A. Thesis, Department of Computer Sciences, University of Texas at Austin, 1993.

    A new system for learning by induction, called LAB, is presented. LAB (Learning for ABduction) learns abductive rules based on a set of training examples. Our goal is to find a small knowledge base which, when used abductively, diagnoses the training examples correctly, in addition to generalizing well to unseen examples. This is in contrast to past systems, which inductively learn rules which are used deductively. Abduction is particularly well suited to diagnosis, in which we are given a set of symptoms (manifestations) and we want our output to be a set of disorders which explain why the manifestations are present. Each training example is associated with potentially multiple categories, instead of one, which is the case with typical learning systems. Building the knowledge base requires a choice between multiple possibilities, and the number of possibilities grows exponentially with the number of training examples. One method of choosing the best knowledge base is described and implemented. The final system is experimentally evaluated, using data from the domain of diagnosing brain damage due to stroke. It is compared to other learning systems and a knowledge base produced by an expert. The results are promising: the rule base learned is simpler than the expert knowledge base and rules learned by one of the other systems, and the accuracy of the learned rule base in predicting which areas are damaged is better than all the other systems as well as the expert knowledge base.

  39. Learning to Model Students: Using Theory Refinement to Detect Misconceptions
    Paul T. Baffes
    Ph.D. proposal, Department of Computer Sciences, University of Texas at Austin, 1993.

    A new student modeling system called ASSERT is described which uses domain independent learning algorithms to model unique student errors and to automatically construct bug libraries. ASSERT consists of two learning phases. The first is an application of theory refinement techniques for constructing student models from a correct theory of the domain being tutored. The second learning cycle automatically constructs the bug library by extracting common refinements from multiple student models which are then used to bias future modeling efforts. Initial experimental data will be presented which suggests that ASSERT is a more effective modeling system than other induction techniques previously explored for student modeling, and that the automatic bug library construction significantly enhances subsequent modeling efforts.

  40. Learning Search-Control Heuristics for Logic Programs: Applications to Speedup Learning and Language Acquisition
    John M. Zelle
    Ph.D. proposal, Department of Computer Sciences, University of Texas at Austin, 1993.

    This paper presents a general framework, learning search-control heuristics for logic programs, which can be used to improve both the efficiency and accuracy of knowledge-based systems expressed as definite-clause logic programs. The approach combines techniques of explanation-based learning and recent advances in inductive logic programming to learn clause-selection heuristics that guide program execution. Two specific applications of this framework are detailed: dynamic optimization of Prolog programs (improving efficiency) and natural language acquisition (improving accuracy). In the area of program optimization, a prototype system, DOLPHIN, is able to transform some intractable specifications into polynomial-time algorithms, and outperforms competing approaches in several benchmark speedup domains. A prototype language acquisition system, CHILL, is also described. It is capable of automatically acquiring semantic grammars, which uniformly incorporate syntactic and semantic constraints to parse sentences into case-role representations. Initial experiments show that this approach is able to construct accurate parsers which generalize well to novel sentences and significantly outperform previous approaches to learning case-role mapping based on connectionist techniques. Planned extensions of the general framework and the specific applications as well as plans for further evaluation are also discussed.

  41. Combining FOIL and EBG to Speed-Up Logic Programs
    John M. Zelle and Raymond J. Mooney
    Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, pp. 1106-1111, Chambery, France, 1993. (IJCAI-93)

    This paper presents an algorithm that combines traditional EBL techniques and recent developments in inductive logic programming to learn effective clause selection rules for Prolog programs. When these control rules are incorporated into the original program, significant speed-up may be achieved. The algorithm is shown to be an improvement over competing EBL approaches in several domains. Additionally, the algorithm is capable of automatically transforming some intractable algorithms into ones that run in polynomial time.

  42. Symbolic Revision of Theories With M-of-N Rules
    Paul T. Baffes and Raymond J. Mooney
    Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, pp. 1135-1140, Chambery, France, 1993. (IJCAI-93)

    This paper presents a major revision of the EITHER propositional theory refinement system. Two issues are discussed. First, we show how run time efficiency can be greatly improved by changing from an exhaustive scheme for computing repairs to an iterative greedy method. Second, we show how to extend EITHER to refine M-of-N rules. The resulting algorithm, NEITHER (New EITHER), is more than an order of magnitude faster and produces significantly more accurate results with theories that fit the M-of-N format. To demonstrate the advantages of NEITHER, we present preliminary experimental results comparing it to EITHER and various other systems on refining the DNA promoter domain theory.

  43. Learning Semantic Grammars With Constructive Inductive Logic Programming
    John M. Zelle and Raymond J. Mooney
    Proceedings of the Eleventh National Conference of the American Association for Artificial Intelligence, pp. 817-822, Washington, D.C., July 1993. (AAAI-93)

    Automating the construction of semantic grammars is a difficult and interesting problem for machine learning. This paper shows how the semantic-grammar acquisition problem can be viewed as the learning of search-control heuristics in a logic program. Appropriate control rules are learned using a new first-order induction algorithm that automatically invents useful syntactic and semantic categories. Empirical results show that the learned parsers generalize well to novel sentences and outperform previous approaches based on connectionist techniques.

  44. Combining Connectionist and Symbolic Learning to Refine Certainty-Factor Rule-Bases
    J. Jeffrey Mahoney and Raymond J. Mooney
    Connection Science, 5 (1993), pp. 339-364. (Special issue on Architectures for Integrating Neural and Symbolic Processing)

    This paper describes Rapture --- a system for revising probabilistic knowledge bases that combines connectionist and symbolic learning methods. Rapture uses a modified version of backpropagation to refine the certainty factors of a Mycin-style rule base and it uses ID3's information gain heuristic to add new rules. Results on refining three actual expert knowledge bases demonstrate that this combined approach generally performs better than previous methods.
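
    The central move is that MYCIN's parallel combination of certainty factors is differentiable, so the factors can be tuned like network weights. A toy sketch with two rules for one conclusion (the rules, the data, and the plain squared-error loop are invented simplifications of Rapture's training):

        # Gradient refinement of certainty factors (a sketch).
        def combine(cfs):                    # MYCIN combination, cf >= 0
            out = 0.0
            for cf in cfs:
                out = out + cf - out * cf    # probabilistic sum
            return out

        def refine(rules, data, lr=0.1, epochs=500):
            # rules: list of (antecedent, cf); data: (facts, target)
            cfs = [cf for _, cf in rules]
            for _ in range(epochs):
                for facts, target in data:
                    active = [i for i, (a, _) in enumerate(rules) if a in facts]
                    p = combine([cfs[i] for i in active])
                    for i in active:         # d(error)/d(cf_i) = 2(p-t)(1-rest)
                        rest = combine([cfs[j] for j in active if j != i])
                        cfs[i] -= lr * 2 * (p - target) * (1 - rest)
                        cfs[i] = min(1.0, max(0.0, cfs[i]))
            return cfs

        rules = [("fever", 0.5), ("rash", 0.5)]
        data = [({"fever"}, 0.9), ({"rash"}, 0.2), ({"fever", "rash"}, 0.92)]
        print([round(c, 2) for c in refine(rules, data)])   # -> [0.9, 0.2]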

  45. Refinement of First-Order Horn-Clause Domain Theories
    Bradley L. Richards and Raymond J. Mooney
    Machine Learning 19,2 (1995), pp. 95-131.

    Knowledge acquisition is a difficult and time-consuming task, and as error-prone as any human activity. The task of automatically improving an existing knowledge base using learning methods is addressed by a new class of systems performing theory refinement. Until recently, such systems were limited to propositional theories. This paper presents a system, FORTE (First-Order Revision of Theories from Examples), for refining first-order Horn-clause theories. Moving to a first-order representation opens many new problem areas, such as logic program debugging and qualitative modelling, that are beyond the reach of propositional systems. FORTE uses a hill-climbing approach to revise theories. It identifies possible errors in the theory and calls on a library of operators to develop possible revisions. The best revision is implemented, and the process repeats until no further revisions are possible. Operators are drawn from a variety of sources, including propositional theory refinement, first-order induction, and inverse resolution. FORTE has been tested in several domains including logic programming and qualitative modelling.

  46. Encouraging Experimental Results on Learning CNF
    Raymond J. Mooney
    Machine Learning 19,1 (1995) pp. 79-92.

    This paper presents results comparing three inductive learning systems using different representations for concepts, namely: CNF formulae, DNF formulae, and decision trees. The CNF learner performs surprisingly well. Results on five natural data sets show that it frequently trains faster and produces more accurate and simpler concepts. The probable explanation for its superior performance is that the other systems are more susceptible to the replication problem.

  47. Belief Revision in the Context of Abductive Explanation
    Siddarth Subramanian
    Technical Report AI92-179, Artificial Intelligence Lab, University of Texas at Austin, March 1991.

    This proposal presents an approach to explanation that incorporates the paradigms of belief revision and abduction. We present an algorithm that combines these techniques and a system called BRACE that is a preliminary implementation of this algorithm. We show the applicability of the BRACE approach to a wide range of domains including scientific discovery, device diagnosis and plan recognition. Finally, we describe our proposals for a new implementation, new application domains for our system and extensions to this approach.

  48. A First-Order Horn-Clause Abductive System and Its Use in Plan Recognition and Diagnosis
    Hwee Tou Ng and Raymond J. Mooney
    Submitted for journal publication.

    A diverse set of intelligent activities, including natural language understanding and diagnosis, requires the ability to construct explanations for observed phenomena. In this paper, we view explanation as abduction, where an abductive explanation is a consistent set of assumptions which, together with background knowledge, logically entails a set of observations. We have successfully built a domain-independent system, ACCEL, in which knowledge about a variety of domains is uniformly encoded in first-order Horn-clause axioms. A general-purpose abduction algorithm, AAA, efficiently constructs explanations in the various domains by caching partial explanations to avoid redundant work. Empirical results show that caching of partial explanations can achieve more than an order of magnitude speedup in run time. We have applied our abductive system to two general tasks: plan recognition in text understanding, and diagnosis of medical diseases, logic circuits, and dynamic systems. The results indicate that ACCEL is a general-purpose system capable of plan recognition and diagnosis, yet efficient enough to be of practical utility.

  49. Abductive Plan Recognition and Diagnosis: A Comprehensive Empirical Evaluation
    Hwee Tou Ng and Raymond J. Mooney
    Proceedings of the Third International Conference on Principles of Knowledge Representation and Reasoning, pp. 499-508, Cambridge, MA, October 1992.

    While it has been realized for quite some time within AI that abduction is a general model of explanation for a variety of tasks, there have been no empirical investigations into the practical feasibility of a general, logic-based abductive approach to explanation. In this paper we present extensive empirical results on applying a general abductive system, ACCEL, to moderately complex problems in plan recognition and diagnosis. In plan recognition, ACCEL has been tested on 50 short narrative texts, inferring characters' plans from actions described in a text. In medical diagnosis, ACCEL has diagnosed 50 real-world patient cases involving brain damage due to stroke (previously addressed by set-covering methods). ACCEL also uses abduction to accomplish model-based diagnosis of logic circuits (a full adder) and continuous dynamic systems (a temperature controller and the water balance system of the human kidney). The results indicate that general purpose abduction is an effective and efficient mechanism for solving problems in plan recognition and diagnosis.

  50. Automatic Abduction of Qualitative Models
    Bradley L. Richards, Ina Kraan, and Benjamin J. Kuipers
    Proceedings of the Tenth National Conference on Artificial Intelligence, pp. 723-728, San Jose, CA, July 1992.

    We describe a method of automatically abducing qualitative models from descriptions of behaviors. We generate, from either quantitative or qualitative data, models in the form of qualitative differential equations suitable for use by QSIM. Constraints are generated and filtered both by comparison with the input behaviors and by dimensional analysis. If the user provides complete information on the input behaviors and the dimensions of the input variables, the resulting model is unique, maximally constrained, and guaranteed to reproduce the input behaviors. If the user provides incomplete information, our method will still generate a model which reproduces the input behaviors, but the model may no longer be unique. Incompleteness can take several forms: missing dimensions, values of variables, or entire variables.
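
    A minimal Python sketch of the dimensional-analysis filter, assuming dimensions are represented as exponent vectors (the variables and dimensions below are illustrative, not the paper's):

        from itertools import permutations

        # Dimensions as (length, time) exponent pairs: velocity = length/time, etc.
        DIMS = {"x": (1, 0), "v": (1, -1), "t": (0, 1), "a": (1, -2)}

        def dim_product(d1, d2):
            # Multiplying quantities adds their dimension exponents.
            return tuple(e1 + e2 for e1, e2 in zip(d1, d2))

        def consistent_mult_constraints(dims):
            # Propose MULT(a, b, c) candidates (a * b = c) and keep only the
            # dimensionally consistent ones.
            for a, b, c in permutations(dims, 3):
                if dim_product(dims[a], dims[b]) == dims[c]:
                    yield (a, b, c)

        print(list(consistent_mult_constraints(DIMS)))
        # finds v*t = x and a*t = v (in both argument orders)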

  51. Learning Relations by Pathfinding
    Bradley L. Richards and Raymond J. Mooney
    Proceedings of the Tenth National Conference on Artificial Intelligence, pp. 50-55, San Jose, CA, July 1992.

    First-order learning systems (e.g., FOIL, FOCL, FORTE) generally rely on hill-climbing heuristics in order to avoid the combinatorial explosion inherent in learning first-order concepts. However, hill-climbing leaves these systems vulnerable to local maxima and local plateaus. We present a method, called relational pathfinding, which has proven highly effective in escaping local maxima and crossing local plateaus. We present our algorithm and provide learning results in two domains: family relationships and qualitative model building.
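
    A minimal Python sketch of relational pathfinding over a hypothetical family-relations graph (the actual system operates on first-order clauses; this shows only the path search):

        from collections import deque

        FACTS = [("parent", "ann", "bob"),
                 ("parent", "ann", "carol"),
                 ("parent", "bob", "dave")]

        def find_path(start, goal, facts):
            # Breadth-first search for a chain of ground relations linking two
            # constants; the path found suggests a clause body, e.g. the path
            # ann -parent-> bob -parent-> dave suggests
            # grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
            adj = {}
            for rel, a, b in facts:
                adj.setdefault(a, []).append((rel, b))
                adj.setdefault(b, []).append((rel + "^-1", a))  # traverse both ways
            frontier, seen = deque([(start, [])]), {start}
            while frontier:
                node, path = frontier.popleft()
                if node == goal:
                    return path
                for rel, nxt in adj.get(node, []):
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append((nxt, path + [(rel, node, nxt)]))
            return None

        print(find_path("ann", "dave", FACTS))
        # [('parent', 'ann', 'bob'), ('parent', 'bob', 'dave')]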

  52. Speeding-up Logic Programs by Combining EBG and FOIL
    John M. Zelle and Raymond J. Mooney
    Proceedings of the 1992 Machine Learning Workshop on Knowledge Compilation and Speedup Learning, Aberdeen, Scotland, July 1992.

    This paper presents an algorithm that combines traditional EBL techniques and recent developments in inductive logic programming to learn effective clause-selection rules for Prolog programs. When these control rules are incorporated into the original program, significant speedup can be achieved. The algorithm not only produces EBL-like speedup of problem solvers, but is also capable of automatically transforming some intractable algorithms into ones that run in polynomial time.

  53. Combining Symbolic and Neural Learning to Revise Probabilistic Theories
    J. Jeffrey Mahoney & Raymond J. Mooney
    Proceedings of the 1992 Machine Learning Workshop on Integrated Learning in Real Domains, Aberdeen, Scotland, July 1992.

    This paper describes RAPTURE --- a system for revising probabilistic theories that combines symbolic and neural-network learning methods. RAPTURE uses a modified version of backpropagation to refine the certainty factors of a Mycin-style rule-base, and it uses ID3's information gain heuristic to add new rules. Results on two real-world domains demonstrate that this combined approach performs as well as or better than previous methods.
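
    A minimal Python sketch of gradient-descent tuning of certainty factors under Mycin's probabilistic-sum combination (the values and single-layer setup are illustrative; RAPTURE's actual update rule is a modified backpropagation over a full rule network):

        # Two rules concluding the same hypothesis with certainty factors cf[0]
        # and cf[1]; when both fire, positive evidence combines by probabilistic
        # sum: pred = cf0 + cf1 - cf0*cf1.
        cf, target, lr = [0.9, 0.8], 0.5, 0.1

        for _ in range(200):
            pred = cf[0] + cf[1] - cf[0] * cf[1]
            err = pred - target
            grad = [1 - cf[1], 1 - cf[0]]       # d(pred)/d(cf_i)
            cf = [c - lr * err * g for c, g in zip(cf, grad)]

        print(round(cf[0] + cf[1] - cf[0] * cf[1], 3))   # -> approximately 0.5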

  54. Growing Layers of Perceptrons: Introducing the Extentron Algorithm
    Paul T. Baffes and John M. Zelle
    Proceedings of the 1992 International Joint Conference on Neural Networks, pp. 392-397, Baltimore, Maryland, June 1992.

    The ideas presented here are based on two observations of perceptrons: (1) when the perceptron learning algorithm cycles among hyperplanes, the hyperplanes may be compared to select one that gives a best SPLIT of the examples, and (2) it is always possible for the perceptron to build a hyperplane that separates at least one example from all the rest. We describe the Extentron, which grows multi-layer networks capable of distinguishing non-linearly-separable data using the simple perceptron rule for linear threshold units. The resulting algorithm is simple, very fast, scales well to large problems, retains the convergence properties of the perceptron, and can be completely specified using only two parameters. Results are presented comparing the Extentron to other neural network paradigms and to symbolic learning systems.
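
    A minimal Python sketch of the underlying perceptron rule (the layer-growing and best-SPLIT selection steps are omitted; the AND data set is illustrative):

        def train_perceptron(examples, epochs=100, lr=1.0):
            # Plain perceptron rule for one linear threshold unit; the
            # Extentron grows layers by feeding each learned unit's output
            # back in as an extra input feature for later units.
            n = len(examples[0][0])
            w, b = [0.0] * n, 0.0
            for _ in range(epochs):
                for x, y in examples:              # labels y in {0, 1}
                    out = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
                    if out != y:                   # update only on mistakes
                        w = [wi + lr * (y - out) * xi for wi, xi in zip(w, x)]
                        b += lr * (y - out)
            return w, b

        # AND is linearly separable, so a single unit converges:
        print(train_perceptron([((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]))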

  55. Using Theory Revision to Model Students and Acquire Stereotypical Errors
    Paul T. Baffes and Raymond J. Mooney
    Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society, pp. 617-622, Bloomington, IN, July 1992.

    Student modeling has been identified as an important component of the long-term development of Intelligent Computer-Aided Instruction (ICAI) systems. Two basic approaches have evolved to model student misconceptions. One uses a static, predefined library of user bugs which contains the misconceptions modeled by the system. The other uses induction to learn student misconceptions from scratch. Here, we present a third approach that uses a machine learning technique called theory revision. Using theory revision allows the system to automatically construct a bug library for use in modeling while retaining the flexibility to address novel errors.

  56. A Preliminary PAC Analysis of Theory Revision
    Raymond J. Mooney
    Computational Learning Theory and Natural Learning Systems, Vol. 3, T. Petsche, S. Judd, and S. Hanson, Eds., MIT Press, 1995, pp. 43-53.

    This paper presents a preliminary analysis of the sample complexity of theory revision within the framework of PAC (Probably Approximately Correct) learnability theory. By formalizing the notion that the initial theory is ``close'' to the correct theory, we show that the sample complexity of an optimal propositional Horn-clause theory revision algorithm is $O((\ln(1/\delta) + d \ln(s_0 + d + n))/\epsilon)$, where $d$ is the syntactic distance between the initial and correct theories, $s_0$ is the size of the initial theory, $n$ is the number of observable features, and $\epsilon$ and $\delta$ are the standard PAC error and probability bounds. The paper also discusses the problems raised by the computational complexity of theory revision.
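
    As an illustration (with arbitrarily chosen values, not taken from the paper), instantiating the bound with $d = 2$, $s_0 = 20$, $n = 10$, $\epsilon = 0.1$, and $\delta = 0.05$ gives

        \[
        m = O\!\left(\frac{\ln(1/\delta) + d\,\ln(s_0 + d + n)}{\epsilon}\right)
          = O\!\left(\frac{\ln 20 + 2\ln 32}{0.1}\right)
          \approx O\!\left(\frac{3.0 + 6.9}{0.1}\right) = O(10^2),
        \]

    so the number of examples scales with the revision distance $d$ rather than with the full size of a theory learned from scratch, formalizing why a close initial theory reduces sample complexity.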

  57. Automated Debugging of Logic Programs via Theory Revision
    Raymond J. Mooney & Bradley L. Richards
    Proceedings of the Second International Workshop on Inductive Logic Programming, Tokyo, Japan, June 1992.

    This paper presents results on using a theory revision system to automatically debug logic programs. FORTE is a recently developed system for revising function-free Horn-clause theories. Given a theory and a set of training examples, it performs a hill-climbing search in an attempt to minimally modify the theory to correctly classify all of the examples. FORTE makes use of methods from propositional theory revision, Horn-clause induction (FOIL), and inverse resolution. The system has been successfully used to debug logic programs written by undergraduate students for a programming languages course.

  58. Batch versus Incremental Theory Refinement
    Raymond J. Mooney
    Proceedings of the AAAI Spring Symposium on Knowledge Assimilation, Stanford, CA, March, 1992.

    Most existing theory refinement systems are not incremental. However, any theory refinement system whose input and output theories are compatible can be used to incrementally assimilate data into an evolving theory. This is done by continually feeding its revised theory back in as its input theory. An incremental batch approach, in which the system assimilates a batch of examples at each step, seems most appropriate for existing theory revision systems. Experimental results with the EITHER theory refinement system demonstrate that this approach frequently increases efficiency without significantly decreasing the accuracy or the simplicity of the resulting theory. However, if the system produces bad initial changes to the theory based on only a small amount of data, these bad revisions can ``snowball'' and result in an overall decrease in performance.
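
    A minimal Python sketch of the incremental-batch loop, where revise(theory, examples) stands for any batch refinement system whose input and output theories are compatible (EITHER in the paper):

        def incremental_batch_refine(theory, stream, revise, batch_size=25):
            # Assimilate examples a batch at a time, feeding each revised
            # theory back in as the input theory for the next batch.
            batch = []
            for example in stream:
                batch.append(example)
                if len(batch) == batch_size:
                    theory = revise(theory, batch)   # output becomes next input
                    batch = []
            if batch:                                # final partial batch
                theory = revise(theory, batch)
            return theory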

  59. A Multistrategy Approach to Theory Refinement
    Raymond J. Mooney & Dirk Ourston
    Machine Learning: A Multistrategy Approach, Vol. IV, R.S. Michalski & G. Tecuci (Eds.), pp. 141-164, Morgan Kaufmann, San Mateo, CA, 1994.

    This chapter describes a multistrategy system that employs independent modules for deductive, abductive, and inductive reasoning to revise an arbitrarily incorrect propositional Horn-clause domain theory to fit a set of preclassified training instances. By combining such diverse methods, EITHER is able to handle a wider range of imperfect theories than other theory revision systems while guaranteeing that the revised theory will be consistent with the training data. EITHER has successfully revised two actual expert theories, one in molecular biology and one in plant pathology. The results confirm the hypothesis that using a multistrategy system to learn from both theory and data gives better results than using either theory or data alone.

  60. Integrating Theory and Data in Category Learning
    Raymond J. Mooney
    Categorization by Humans and Machines: The Psychology of Learning and Motivation, Vol. 29, G. Nakamura, R. Taraban, & D.L. Medin (Eds.), pp. 189-218, Academic Press, Orlando, FL, 1993.

    Recent results in both machine learning and cognitive psychology demonstrate that effective category learning involves an integration of theory and data. First, theories can bias induction, affecting what category definitions are extracted from a set of examples. Second, conflicting data can cause theories to be revised. Third, theories can alter the representation of data through feature formation. This chapter reviews two machine learning systems that attempt to integrate theory and data in one or more of these ways. IOU uses a domain theory to acquire part of a concept definition and to focus induction on the unexplained aspects of the data. EITHER uses data to revise an imperfect theory and uses theory to add abstract features to the data. Recent psychological experiments reveal that these machine learning systems exhibit several important aspects of human category learning. Specifically, IOU has been used to successfully model some recent experimental results on the effect of functional knowledge on category learning.

  61. Theory Refinement Combining Analytical and Empirical Methods
    Dirk Ourston and Raymond J. Mooney
    Artificial Intelligence, 66 (1994), pp. 311-344.

    This article describes a comprehensive approach to automatic theory revision. Given an imperfect theory, the approach combines explanation attempts for incorrectly classified examples in order to identify the failing portions of the theory. For each theory fault, correlated subsets of the examples are used to inductively generate a correction. Because the corrections are focused, they tend to preserve the structure of the original theory. Because the system starts with an approximate domain theory, in general fewer training examples are required to attain a given level of performance (classification accuracy) compared to a purely empirical system. The approach applies to classification systems employing a propositional Horn-clause theory. The system has been tested in a variety of application domains, and results are presented for problems in the domains of molecular biology and plant disease diagnosis.

  62. Induction Over the Unexplained: Using Overly-General Domain Theories to Aid Concept Learning
    Raymond J. Mooney
    Machine Learning, 10, 1 (1993), pp. 79-110.

    This paper describes and evaluates an approach to combining empirical and explanation-based learning called Induction Over the Unexplained (IOU). IOU is intended for learning concepts that can be partially explained by an overly-general domain theory. An eclectic evaluation of the method is presented which includes results from all three major approaches: empirical, theoretical, and psychological. Empirical results show that IOU is effective at refining overly-general domain theories and that it learns more accurate concepts from fewer examples than a purely empirical approach. The application of theoretical results from PAC learnability theory explains why IOU requires fewer examples. IOU is also shown to be able to model psychological data demonstrating the effect of background knowledge on human learning.
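
    A minimal Python sketch of the IOU decomposition (explained and induce are hypothetical stand-ins, not IOU's actual interface):

        def induce_over_unexplained(examples, explained, induce):
            # Focus empirical learning on the features the overly-general
            # domain theory does not already explain; the final concept is
            # the theory's explained part conjoined with the induced part.
            residue = [({f: v for f, v in features.items() if f not in explained},
                        label)
                       for features, label in examples]
            return induce(residue)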

  63. An Efficient First-Order Horn-Clause Abduction System Based on the ATMS
    Hwee Tou Ng and Raymond J. Mooney
    Proceedings of the Ninth National Conference on Artificial Intelligence, pp. 494-499, Anaheim, CA, July 1991.

    This paper presents an algorithm for first-order Horn-clause abduction that uses an ATMS to avoid redundant computation. This algorithm is either more efficient or more general than any previous abduction algorithm. Since computing all minimal abductive explanations is intractable, we also present a heuristic version of the algorithm that uses beam search to compute a subset of the simplest explanations. We present empirical results on a broad range of abduction problems from text understanding, plan recognition, and device diagnosis, which demonstrate that our algorithm is at least an order of magnitude faster than an alternative abduction algorithm that does not use an ATMS.
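
    A minimal propositional Python sketch of the beam-search variant (hypothetical rules; the ATMS machinery, first-order reasoning, and consistency checking are omitted):

        import heapq

        def beam_abduce(observations, rules, beam_width=3):
            # Heuristic abduction: keep only the `beam_width` partial
            # explanations with the fewest assumptions instead of computing
            # all minimal explanations. Assumes acyclic propositional rules;
            # atoms with no rules are abducible.
            beam = [(frozenset(), tuple(observations))]  # (assumptions, open goals)
            complete = []
            while beam:
                successors = []
                for assumed, goals in beam:
                    if not goals:
                        complete.append(assumed)
                        continue
                    goal, rest = goals[0], goals[1:]
                    if goal not in rules:                # abducible: assume it
                        successors.append((assumed | {goal}, rest))
                    else:
                        for body in rules[goal]:         # backward-chain
                            successors.append((assumed, tuple(body) + rest))
                beam = heapq.nsmallest(beam_width, successors,
                                       key=lambda s: len(s[0]))
            return complete

        print(beam_abduce(["slippery"], {"slippery": [["wet"]],
                                         "wet": [["rained"], ["sprinkler"]]}))
        # [frozenset({'rained'}), frozenset({'sprinkler'})]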

  64. Improving Shared Rules in Multiple Category Domain Theories
    Dirk Ourston and Raymond J. Mooney
    Proceedings of the Eighth International Machine Learning Workshop, pp. 534-538, Evanston, IL, June 1991.

    This paper presents an approach to improving the classification performance of a multiple category theory by correcting intermediate rules which are shared among the categories. Using this technique, the performance of a theory in one category can be improved through training in an entirely different category. Examples of the technique are presented and experimental results are given.

  65. Constructive Induction in Theory Refinement
    Raymond J. Mooney and Dirk Ourston
    Proceedings of the Eighth International Machine Learning Workshop, pp. 178-182, Evanston, IL, June 1991.

    This paper presents constructive induction techniques recently added to the EITHER theory refinement system. These additions allow EITHER to handle arbitrary gaps at the ``top,'' ``middle,'' and/or ``bottom'' of an incomplete domain theory. Intermediate concept utilization employs existing rules in the theory to derive higher-level features for use in induction. Intermediate concept creation employs inverse resolution to introduce new intermediate concepts in order to fill gaps in a theory that span multiple levels. These revisions allow EITHER to make use of imperfect domain theories in the ways typical of previous work in both constructive induction and theory refinement. As a result, EITHER is able to handle a wider range of theory imperfections than does any other existing theory refinement system.
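
    A minimal propositional Python sketch of intermediate concept creation by intra-construction, one of the inverse resolution operators (the clause representation and the invented predicate name are illustrative):

        def intra_construction(clause1, clause2):
            # Factor the common body literals of two clauses for the same head
            # into a new intermediate concept; clauses are (head, body_set).
            (head1, body1), (head2, body2) = clause1, clause2
            assert head1 == head2
            common = body1 & body2
            new_pred = "new_concept"                 # hypothetical invented term
            return [(head1, common | {new_pred}),    # p :- common, new_concept
                    (new_pred, body1 - common),      # new_concept :- rest of 1
                    (new_pred, body2 - common)]      # new_concept :- rest of 2

        print(intra_construction(("p", {"a", "b", "c"}),
                                 ("p", {"a", "b", "d"})))
        # p :- a, b, new_concept ; new_concept :- c ; new_concept :- d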

  66. Theory Refinement with Noisy Data
    Raymond J. Mooney and Dirk Ourston
    Technical Report AI 91-153, Artificial Intelligence Lab, University of Texas at Austin, March 1991.

    This paper presents a method for revising an approximate domain theory based on noisy data. The basic idea is to avoid making changes to the theory that account for only a small amount of data. This method is implemented in the EITHER propositional Horn-clause theory revision system. The paper presents empirical results on artificially corrupted data to show that this method successfully prevents over-fitting. In other words, when the data is noisy, performance on novel test data is considerably better than revising the theory to completely fit the data. When the data is not noisy, noise processing causes no significant degradation in performance. Finally, noise processing increases efficiency and decreases the complexity of the resulting theory.
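
    A minimal Python sketch of the underlying idea (min_gain and n_correct are hypothetical stand-ins, not EITHER's actual interface):

        def accept_revision(old_theory, new_theory, examples, n_correct, min_gain=5):
            # Adopt a candidate revision only if it corrects at least `min_gain`
            # more training examples than the current theory; smaller gains are
            # presumed to reflect noise in the data and are rejected.
            gain = n_correct(new_theory, examples) - n_correct(old_theory, examples)
            return new_theory if gain >= min_gain else old_theory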

  67. On the Role of Coherence in Abductive Explanation
    Hwee Tou Ng and Raymond J. Mooney
    Proceedings of the Eighth National Conference on Artificial Intelligence, pp. 337-342, Boston, MA, 1990.

    Abduction is an important inference process underlying much intelligent human activity, including text understanding, plan recognition, disease diagnosis, and physical device diagnosis. In this paper, we describe some problems encountered using abduction to understand text, and present some solutions to overcome these problems. The solutions we propose center on the use of a different criterion, called explanatory coherence, as the primary measure to evaluate the quality of an explanation. In addition, explanatory coherence plays an important role in the construction of explanations, both in determining the appropriate level of specificity of a preferred explanation, and in guiding the heuristic search to efficiently compute explanations of sufficiently high quality.


    estlin@cs.utexas.edu