Newsgroups: comp.ai.nat-lang
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!newsfeed.pitt.edu!godot.cc.duq.edu!newsgate.duke.edu!news.mathworks.com!newsfeed.internetmci.com!in2.uu.net!EU.net!sun4nl!freya.let.rug.nl!let.rug.nl!markjan
From: markjan@let.rug.nl (M.J. Nederhof)
Subject: master class "Finite-State Techniques in NLP" (Netherlands)
Sender: news@let.rug.nl (News system at let.rug.nl)
Message-ID: <1996May22.092141.21874@let.rug.nl>
Date: Wed, 22 May 1996 09:21:41 GMT
Nntp-Posting-Host: grid.let.rug.nl
Organization: Faculty of Arts, University of Groningen
Lines: 129

        "FINITE-STATE TECHNIQUES IN NATURAL LANGUAGE PROCESSING"

              July 8-12, 1996, Groningen (The Netherlands)

      Master class, part of the BCN Summer School, July 1-12, 1996

                        lectures by (among others)

                 MARTIN KAY (Xerox Palo Alto Research Center)
                 GIORGIO SATTA (University of Padua)
                 ATRO VOUTILAINEN (University of Helsinki)
      
Any kind of mechanical processing of input, in whatever form, involves some
kind of finite-state process. Much theory was already developed during the
early days of computer science, but much of it is very abstract, or at least
not readily applicable to the processing of natural language. The last few
years have seen a surge of interesting publications that close the gap
between the theory of finite-state techniques and the practice of
computational linguistics.

During the course of our master class, students will be made familiar with 
these new developments. Three prominent researchers will discuss a wide 
range of topics, including some ideas just emerging in this field. 
Apart from the lectures by the three invited speakers, some ongoing
research at the Humanities Computing Department (Alfa Informatica)
of the University of Groningen will be discussed. To make the course
accessible to students without any prior knowledge of finite-state techniques,
we will start with some introductory lectures on formal language theory,
finite-state automata and transducers, regular languages, rational 
transductions, etc.

The master class will be held in the building of the Faculty of Arts.
It comprises 5 sessions, each from 9:00 to 12:00. The registration fee for 
the master class also covers the other events of the summer school.

For more information concerning this master class, contact the coordinator

             Mark-Jan Nederhof 
             University of Groningen
             Faculty of Arts
             P.O. Box 716
             NL-9700 AS Groningen
             The Netherlands
             E-mail: markjan@let.rug.nl
             Tel. +31-50-3635970
             Fax. +31-50-3636855

and see http://grid.let.rug.nl/~markjan/masterclass.html for the most 
up-to-date version of this document.

For registration and for more information concerning the summer school
of which this master class is part, see the Web page http://www.bcn.rug.nl/
or contact
 
             Office of Graduate School BCN
             Nijenburgh 4
             NL-9747 AG Groningen
             The Netherlands
             E-mail: bureau@bcn.rug.nl
             Tel. +31-50-3634734
             Fax. +31-50-3634740

An overview of the lectures by the invited speakers follows:

============================================================================

Martin Kay

The properties of classical finite-state automata and regular sets,
as well as finite-state transducers and regular relations.  Algorithms that
implement the complete calculus of set theoretic operations on finite
automata, plus some important additional ones such as minimization.  Also
algorithms for useful operations on finite transducers.  The emphasis will
be on methods that can be efficiently applied to large machines such as
arise in phonology, morphology, and the lexicon.
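
As a rough illustration of the set-theoretic operations mentioned above,
here is a minimal Python sketch of intersection (by product construction)
and complement on complete deterministic automata. The class DFA, the
helper names and the two toy automata are assumptions made for this
example only; they are not taken from the lectures, and nothing here is
tuned for the large machines the lectures focus on.

```python
class DFA:
    """Complete deterministic finite automaton over a fixed alphabet."""

    def __init__(self, states, alphabet, delta, start, finals):
        self.states = set(states)
        self.alphabet = set(alphabet)
        self.delta = dict(delta)      # maps (state, symbol) -> state
        self.start = start
        self.finals = set(finals)

    def accepts(self, word):
        state = self.start
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state in self.finals


def intersect(a, b):
    """Product construction: accepts the words accepted by both a and b."""
    assert a.alphabet == b.alphabet
    start = (a.start, b.start)
    delta, seen, todo = {}, {start}, [start]
    while todo:                       # explore reachable state pairs only
        p, q = todo.pop()
        for s in a.alphabet:
            target = (a.delta[(p, s)], b.delta[(q, s)])
            delta[((p, q), s)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    finals = {(p, q) for (p, q) in seen if p in a.finals and q in b.finals}
    return DFA(seen, a.alphabet, delta, start, finals)


def complement(a):
    """Swap final and non-final states (valid because a is complete)."""
    return DFA(a.states, a.alphabet, a.delta, a.start, a.states - a.finals)


# Toy automata over {a, b}: "even number of a's" and "ends in b".
even_a = DFA({0, 1}, "ab",
             {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}, 0, {0})
ends_b = DFA({0, 1}, "ab",
             {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}, 0, {1})
both = intersect(even_a, ends_b)
```

Union and difference follow from these two operations by De Morgan's
laws, which is one way to obtain the full Boolean calculus on automata.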

============================================================================

Giorgio Satta

First lecture (90 min):
Finite State Tree Automata and Transformation-Based Parsing 

We present the paradigm of transformation-based parsing (Brill, 1993)
and develop efficient parsing algorithms based on finite state tree automata. 

This lecture will cover the following topics:  Top-down tree automata, 
bottom-up tree automata.  Tree regular expressions. Tree automata and 
tree pattern matching algorithms.  Precomputation of tree transformations 
into tree automata.  Transformation-based parsing algorithms. Overlapping 
redexes. 
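
For readers new to tree automata, the following Python sketch shows a
deterministic bottom-up tree automaton. A tree is a nested tuple
(label, child, ...), and each rule maps a label together with the states
of its children to a state. The rule table, the state names and the
example trees are made up for illustration; they are not the lecturer's
material.

```python
# Rules of a toy bottom-up tree automaton:
# (node label, tuple of child states) -> state assigned to the node.
RULES = {
    ("Det", ()): "det",
    ("N", ()): "n",
    ("V", ()): "v",
    ("NP", ("det", "n")): "np",
    ("NP", ("n",)): "np",
    ("VP", ("v", "np")): "vp",
    ("VP", ("v",)): "vp",
    ("S", ("np", "vp")): "s",
}
FINAL = {"s"}


def run(rules, tree):
    """Return the state reached at the root, or None if no rule applies."""
    label, *children = tree
    child_states = tuple(run(rules, child) for child in children)
    if None in child_states:          # some subtree was already rejected
        return None
    return rules.get((label, child_states))


def accepts(tree):
    return run(RULES, tree) in FINAL


good = ("S", ("NP", ("Det",), ("N",)), ("VP", ("V",)))
bad = ("S", ("VP", ("V",)), ("NP", ("N",)))
```

States are computed from the leaves upwards, so a subtree is examined
only once no matter how many rules mention its state.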


Second lecture (90 min):
Finite State Transducers and Constraint Ranking

We present the notion of constraint ranking as developed by recent 
phonological theories. Under the assumption that constraints are 
represented through regular expressions, we develop finite state 
transducer implementations of these theories.  

This lecture will cover the following topics: Optimality Theory (Prince 
and Smolensky 1993). Constraint ranking and constraint violability.  
Optimality systems. Conditional intersection of regular languages.  
Computation of constraint violability through finite state transducers. 
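
To make the idea of ranked, violable constraints concrete, here is a
small Python sketch. Each constraint is a regular expression whose
matches count as violations, and the winner among a finite list of
candidates is the one whose violation profile is smallest under the
ranking, compared lexicographically. This is only a brute-force
evaluation, not a finite-state transducer implementation. The candidate
notation (<t> for an unparsed segment, [a] for an inserted one, "." for
a syllable boundary) and the three constraints are assumptions chosen
for the example.

```python
import re

# Toy constraints as regular expressions; each match is one violation.
CONSTRAINTS = {
    "NOCODA": r"[ptk](?=\.|$)",   # consonant closing a syllable
    "MAX": r"<[^>]*>",            # unparsed (deleted) input segment
    "DEP": r"\[[^\]]*\]",         # inserted (epenthetic) segment
}


def profile(candidate, ranking):
    """Violation counts of candidate, ordered from highest-ranked down."""
    return tuple(len(re.findall(CONSTRAINTS[name], candidate))
                 for name in ranking)


def optimal(candidates, ranking):
    """Strict domination: compare violation profiles lexicographically."""
    return min(candidates, key=lambda c: profile(c, ranking))


# Hypothetical candidate outputs for an input /pat/.
candidates = ["pat", "pa<t>", "pa.t[a]"]
```

With these toy definitions, ranking NOCODA above DEP above MAX selects
the candidate with deletion, while swapping MAX and DEP selects the one
with insertion.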

============================================================================

Atro Voutilainen

Surface-oriented reductionistic finite-state parsing

These lectures (3 hours) outline recent work on finite-state (FS) parsing
in Helsinki (Koskenniemi, Tapanainen, Voutilainen). Most of the attention
is given to linguistic rather than algorithmic issues.

- Linguistic representation: morphological, syntactic and word boundary tags.
- Specification of grammatical representation. Grammar definition corpus.
- Representation of morphologically analysed (ambiguous) sentences and
     rules: regular expressions that are compiled into deterministic FSAs
     before parsing.
- Rule formalism: implication rules, rejection rules.
- After lexical analysis, parsing is reductionistic: illegitimate
     readings are discarded; no new readings are added. Parsing by
     intersection: all grammar automata are intersected with the
     (ambiguous) sentence automaton.
- How to write a realistic parsing grammar. Dealing with remaining
     ambiguities.
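
The reductionistic idea in the list above can be sketched in a few lines
of Python. The ambiguous sentence is a chain of tag sets, each rule is a
small deterministic automaton over tags, and intersecting them keeps
only the readings that lie on a path accepted by every rule. In this
sketch the rule automata are hand-written step functions standing in for
compiled regular expressions. The tag set, the two rules and the example
sentence are invented for illustration and do not reflect the actual
Helsinki grammar or formalism.

```python
def no_verb_after_det(prev_was_det, tag):
    """Rejection rule: a determiner may not be followed by V or AUX."""
    if prev_was_det and tag in {"V", "AUX"}:
        return None                   # dead state: reading is illegitimate
    return tag == "DET"


def needs_verb(seen_verb, tag):
    """Tracks whether at least one V has been seen."""
    return seen_verb or tag == "V"


# Each rule: (start state, step function, predicate for final states).
RULES = [
    (False, no_verb_after_det, lambda state: True),
    (False, needs_verb, lambda state: state),
]


def parse_by_intersection(sentence, rules):
    """sentence: list of (word, tag set). Returns the surviving tag sets."""
    start = tuple(rule[0] for rule in rules)
    layers, edges = [{start}], []
    for _, tags in sentence:          # forward pass: reachable states
        layer_edges = set()
        for src in layers[-1]:
            for tag in tags:
                dst = tuple(step(s, tag)
                            for (_, step, _), s in zip(rules, src))
                if None not in dst:
                    layer_edges.add((src, tag, dst))
        edges.append(layer_edges)
        layers.append({dst for _, _, dst in layer_edges})
    alive = {q for q in layers[-1]
             if all(fin(s) for (_, _, fin), s in zip(rules, q))}
    surviving = []
    for layer_edges in reversed(edges):   # backward pass: prune dead ends
        kept = {e for e in layer_edges if e[2] in alive}
        surviving.append({tag for _, tag, _ in kept})
        alive = {src for src, _, _ in kept}
    return surviving[::-1]


sentence = [("the", {"DET"}), ("can", {"N", "V", "AUX"}),
            ("rusts", {"N", "V"})]
result = parse_by_intersection(sentence, RULES)
```

No reading is ever added: the result for each word is a subset of its
original tag set, and an empty set would mean the rules reject every
reading of the sentence.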

