From crabapple.srv.cs.cmu.edu!cantaloupe.srv.cs.cmu.edu!das-news.harvard.edu!noc.near.net!howland.reston.ans.net!usc!elroy.jpl.nasa.gov!ames!koriel!sh.wide!wnoc-kyo!atrwide!atr-la!awb Mon Jul 19 18:23:38 EDT 1993
Article: 533 of comp.ai.nat-lang
Xref: crabapple.srv.cs.cmu.edu comp.ai.nat-lang:533
Newsgroups: comp.ai.nat-lang
From: awb@itl.atr.co.jp (Alan W Black)
Subject: Re: computational morphologies
In-Reply-To: jahargra@dante.nmsu.edu's message of 15 Jul 1993 01:48:54 GMT
Message-ID: <AWB.93Jul16091137@as53.itl.atr.co.jp>
Sender: news@itl.atr.co.jp (USENET News System)
Nntp-Posting-Host: as53
Organization: ATR Interpreting Telecommunications Research Labs.,Japan
References: <222d26INNich@dns1.NMSU.Edu>
Date: Fri, 16 Jul 1993 00:11:37 GMT
Lines: 150

In article <222d26INNich@dns1.NMSU.Edu> jahargra@dante.nmsu.edu (HARGRAVE III) writes:

 |From: jahargra@dante.nmsu.edu (HARGRAVE III)
 |Greetings,
 |
 |	  I'm looking for information on the following topics:
 |
 |1.  What is the current state-of-the-art in computational morphology?
 |I'm familiar with the KIMMO-like systems.  Any newer alternatives?  I
 |have seen some work on syllable-based morphologies and I would like to
 |know more.
 |

I (personally) feel that the KIMMO two-level finite state transducer
model is pretty good for many languages.  It has been tried for a
number of quite different languages and has been quite successful.  The
problem with the original Koskenniemi model is that it offers only
finite state morphosyntax alongside its finite state morphographemics
(these two aspects of finite stateness are often confused).  Finite
state morphosyntax means that only continuation classes are specified
for each morpheme class.  The work I was involved in allowed
morphosyntax to be specified as a feature grammar, thus offering much
more power and making it much easier to specify more complex
morphosyntax (that is, which morphemes can be joined).  Our work is
described in the MIT Press book Computational Morphology and code is
available by anonymous ftp from
scott.cogsci.ed.ac.uk[129.215.144.3]:/pub/phonology/tools/MAP/MAP3.1.tar.Z
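
As a rough illustration of what "only continuation classes" means, here
is a minimal Python sketch.  The lexicon, the class names (Root, Suffix,
End) and the function segmentations are all made up for this example;
this is not the KIMMO lexicon format nor the code in the MAP
distribution, just the general idea that each morpheme class lists the
classes allowed to follow it.

```python
# Hypothetical toy lexicon: each morpheme class lists its member
# morphemes and the classes that may follow it (its continuation classes).
LEXICON = {
    "Root":   {"morphemes": ["walk", "talk"], "next": ["Suffix", "End"]},
    "Suffix": {"morphemes": ["ed", "ing", "s"], "next": ["End"]},
}


def segmentations(word, cls="Root"):
    """Return every split of `word` licensed by the continuation classes."""
    if cls == "End":
        # A word is accepted only if nothing is left over at the end.
        return [[]] if word == "" else []
    results = []
    for morpheme in LEXICON[cls]["morphemes"]:
        if word.startswith(morpheme):
            rest = word[len(morpheme):]
            for following in LEXICON[cls]["next"]:
                for tail in segmentations(rest, following):
                    results.append([morpheme] + tail)
    return results


print(segmentations("walked"))
print(segmentations("edwalk"))
```

The only thing a morpheme can say about what follows it is a list of
class names.  A feature grammar replaces those fixed lists with rules
over feature descriptions of the morphemes, which is where the extra
power comes from.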

But enough of the plug and let me try to answer your question.  There
are a number of possible extensions to Koskenniemi-type models which
have been discussed, as opposed to different paradigms (see the
paragraph on paradigmatic morphology below).  Still within the
KIMMO-like systems there are systems that make the symbols of the
transducers non-atomic and include features.  That is, instead of
breaking a word into its individual characters we break it down into a
number of feature structures, one for each character (or phoneme).
Morphological rules then act on these feature descriptions of the
components of morphemes rather than on simple atomic symbols.  This
should make the specification of phonological rules easier.  Work by
Trost (see refs below) is along those lines.  Making the "characters"
more complex means less need for diacritics (and for using upper and
lower case to try to make distinctions between items that really
require a more general descriptive formalism).  Also, higher level rule
formalisms than the Koskenniemi context sensitive rewrite rules have
been proposed, including one by myself (Black 87) which was later
expanded for use in phonology (Pulman and Hepple (Cambridge Computing
Laboratory, UK) but I don't have the full reference for that).
Proposals for expanding the two-level model for Semitic languages like
Arabic have also been made, typically by expanding the number of levels
(though formally one can still look at them as finite state
transducers); work such as (Kay 87) is an example.
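
To make the point about non-atomic symbols concrete, here is a small
Python sketch.  Everything in it is an assumption for illustration: the
feature names (sibilant, vowel), the helper functions, and the choice of
English plural e-insertion as the rule.  It is a one-way generation
rule, not a two-level transducer, and it is not Trost's formalism.

```python
def features(ch):
    """Map a character to a (hypothetical) feature description."""
    return {"sibilant": ch in "sxz", "vowel": ch in "aeiou"}


def matches(description, pattern):
    """True if every feature in `pattern` has the same value in `description`."""
    return all(description.get(name) == value for name, value in pattern.items())


def attach_plural(stem):
    """Add plural 's', inserting 'e' after a stem-final [+sibilant] segment.

    Assumes a non-empty stem.
    """
    if matches(features(stem[-1]), {"sibilant": True}):
        return stem + "es"
    return stem + "s"


print(attach_plural("fox"))
print(attach_plural("cat"))
```

With atomic symbols the rule context has to enumerate s, x, z and so on
explicitly; with feature descriptions the rule names the property it
cares about once.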

But all of the above are really still Koskenniemi-like systems.  An
alternative model is the work in what is called paradigmatic morphology.
It concentrates on describing classes and inheritance between classes
(mostly for morphosyntax).  Such systems are described in Jo Calder's
PhD thesis (Calder 89) and are used in Lynn Cahill's system MOLUSC
(Cahill 90).  The idea is different from the morphosyntax in both
the KIMMO system and our own and is an interesting way to try to
systematically deal with morphology in languages such as Latin.
(There was a student at Edinburgh looking at this with respect to
Slavic languages but I'm not sure of the current state of that.)
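
The flavour of classes plus inheritance can be sketched in a few lines
of Python.  This is only a toy under my own assumptions: the table
PARADIGMS, the cell names (nom_sg, acc_sg, gen_sg) and the functions are
invented here and are not the formalism of Calder or of MOLUSC.  It
shows a Latin second declension class and a neuter subclass that
inherits everything except the nominative singular.

```python
# Hypothetical paradigm table: a class gives endings for some cells and
# inherits the remaining cells from its parent class.
PARADIGMS = {
    "second_declension": {
        "parent": None,
        "endings": {"nom_sg": "us", "acc_sg": "um", "gen_sg": "i"},
    },
    "second_declension_neuter": {
        "parent": "second_declension",
        "endings": {"nom_sg": "um"},  # overrides only the nominative
    },
}


def ending(paradigm, cell):
    """Look up a cell, walking up the inheritance chain until it is found."""
    while paradigm is not None:
        entry = PARADIGMS[paradigm]
        if cell in entry["endings"]:
            return entry["endings"][cell]
        paradigm = entry["parent"]
    raise KeyError(cell)


def inflect(stem, paradigm, cell):
    """Build a word form from a stem and the ending for the requested cell."""
    return stem + ending(paradigm, cell)


print(inflect("domin", "second_declension", "nom_sg"))
print(inflect("bell", "second_declension_neuter", "nom_sg"))
print(inflect("bell", "second_declension_neuter", "gen_sg"))
```

The neuter class states only what differs from its parent, which is the
attraction of this style for heavily inflected languages.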

 |2.  Any work done on morphology induction?  Either rule-based or
 |statistical. 
 |

Yes, there is work on this, though I'm not very familiar with it.  An
Edinburgh MPhil some years ago by Andy Golding offered such
a system (see refs below).  I have been aware of later systems
but unfortunately don't have the references.

 |	  My intent is to implement a generic morphology workstation
 |where users can interactively develop high coverage morphologies
 |for use in other systems.  This is part of my masters project.
 |

Sounds interesting.  The problem I find with English morphology
is that it can be viewed as very simple, so that almost any system can
do a reasonable job on it, or as very complex if you wish to include
various aspects of derivational morphology (how far do you go?
do you wish to decompose "microbiology" to micro-bio-ology?)
and deal with all exceptions with general rules rather than just
dealing with them as exceptions.  Other languages offer much
more interesting cases.

good luck

Alan

* Alan W Black ---  ATR Interpreting Telecommunications Laboratories *
2-2 Hikaridai                         email: awb@itl.atr.co.jp
Seika-cho, Soraku-gun,                tel: (+81) 7749 5 1314
Kyoto 619-02, Japan                   fax: (+81) 7749 5 1308


@inproceedings(trost90,
 key = "Trost",
 title = "The Application of Two Level Morphology to non-concatenative
{G}erman morphology",
 author = "Trost, H.",
 booktitle = "Proceedings of 13th International Conference on Computational Linguistics",
 pages = "371-376",
 year = 1990
)

@inproceedings(black87,
 key = "Black et al.",
 author = "Black, A. and Ritchie, G. and Pulman, S. and Russell, G.",
 title = "Formalisms for Morphographemic Description",
 booktitle = "Proceedings of 3rd Conference of the European Chapter of the 
Association for Computational Linguistics",
 year = 1987,
 pages = "11-18"
)

@inproceedings(kay87,
 key = "Kay",
 author = "Kay, M.",
 title = "Nonconcatenative Finite-State Morphology",
 booktitle = "Proceedings of 3rd Conference of the European Chapter of the 
Association for Computational Linguistics",
 year = 1987,
 pages = "2-10"
)
 

@inproceedings(calder89,
 key = "Calder",
 author = "Calder, J.",
 title = "Paradigmatic Morphology",
 booktitle = "Proceedings of 4th Conference of the European Chapter of the
Association for Computational Linguistics",
 year = 1989,
 address = "Manchester",
 pages = "58-65"
)

@inproceedings(cahill90a,
 key = "Cahill",
 author = "Cahill, L.J.",
 title = "Syllable-based morphology",
 booktitle = "Proceedings of 13th International Conference on Computational
              Linguistics",
 year = 1990,
 address = "Helsinki",
 pages = "48-53"
)