From postmaster@x400.hd.uib.no Wed Oct 7 01:54:33 1992 Date: Wed, 7 Oct 1992 00:54:33 +0100 From: Knut Hofland To: Receivers of list CORPORA Subject: Adm. message For a while, I have been holding and inspecting messages to the Corpora list, to filter out requests for addition/deletion etc. The situation now seems reasonably stable so I go back to the automatic redistribution. You have seen several addresses starting like corp..@..hd.uib.no. Please use only one of these (preferably corpora@nora.hd.uib.no or corpora@x400.hd.uib.no) and do not CC: to the other addresses. I have also changed distribution system from an X400 mailer to a Unix mailer, due to problems with some addresses with _ (underscore). I am sorry that in this process, some of you got a couple of error messages that were not supposed to go to the list. Old messages can be read by connecting with Gopher to nora.hd.uib.no or by using FTP or our mail based server fileserv@hd.uib.no as described in the welcoming message. If you have not got the welcoming message, you can request this by sending the line: send corpora corpora.list.welcome to: fileserv@hd.uib.no Knut Hofland From corplst Wed Oct 7 18:14:58 1992 Date: Wed, 7 Oct 1992 18:14:58 +0100 From: corplst (CORPORA list) To: corpora Subject: re: Non-English taggers and tagged corpora Send-date: Wed, 7 Oct 1992 17:16:40 UTC+0100 From: Patrick John Coppock To: Message-ID: corpora:107 204*C=no;PRMD=uninett;O=unit;OU=avh;S=patCoppock Subject: re: Non-English taggers and tagged corpora To: "eijk p"@cecehv.ENET.dec.com Subject: Non-English taggers and tagged corpora You ask: I was wondering whether there are people reading this list who are working on (or have references to) taggers for languages other than English. I am especially interested in German, French and Spanish. I assume you have heard of the CHILDES project at Carnegie Mellon university. If not, then you might be interested in contacting them. 
They have, amongst other things, a corpus of child language texts, coded according to the CHILDES system (Child Language Data Exchange System). There is also programware available for both Mac and IBM-DOS for working with corpuses tagged using this system. The coordinator of the project is Jeff McWhinney, and the e-mail address for info. is: childes@andrew.cmu.edu. It is possible to ftp stuff from the CHILDES corpus from poppy.psy.cmu.edu (ip 128.2.298.42) best wishes pat coppock dept of applied linguistics University of Trondheim 7055 DRAGVOLL Norway From corplst Wed Oct 7 23:58:37 1992 Date: Wed, 7 Oct 1992 23:58:37 +0100 From: corplst (CORPORA list) To: corpora Subject: Non-English taggers and tagged corpora Send-date: Wed, 7 Oct 1992 12:18:02 UTC-0600 From: ted To: Message-ID: corpora:109 9210071818.AA11745(a)NMSU.Edu Subject: Non-English taggers and tagged corpora There is also programware available for both Mac and IBM-DOS for working with corpuses tagged using this system. what sort of tags are we talking about? i was looking for programs which would accept text as input and produce text+part of speech tags as output. i didn't think that the childes project had anything of that sort at all. From corplst Thu Oct 8 09:55:43 1992 Date: Thu, 8 Oct 1992 09:55:43 +0100 From: corplst (CORPORA list) To: corpora Subject: Re: Non-English taggers and tagged corpora Send-date: Thu, 8 Oct 1992 9:05:55 UTC+0100 From: (Helmut Feldweg) To: (CORPORA list) Subject: Re: Non-English taggers and tagged corpora > i was looking for programs which would accept text as input and > produce text+part of speech tags as output. i didn't think that the > childes project had anything of that sort at all. > The CHILDES project has a part of speech tagger for English, a similar system for German is under development. These taggers require the data to be formatted according to the CHILDES transcript conventions. 
- Helmut Feldweg From eijk_p@cecehv.enet.dec.com Thu Oct 8 09:44:28 1992 Date: Thu, 8 Oct 92 09:35:52 MET From: 08-Oct-1992 0841 To: corpora@nora.hd.uib.no Apparently-To: corpora@nora.hd.uib.no Subject: Non-English taggers tagged corpora ted writes: what sort of tags are we talking about? i was looking for programs which would accept text as input and produce text+part of speech tags as output. In my original query I indeed used tagging in the linguistic sense of adding grammatical (and lexeme) information (in a broad sense) to a _word_, as opposed to reconstructing (marking) the _phrasal_ structure of text (for which tags are used in mark up languages). For those interested in Dutch, I have a stochastic tagger for Dutch, based on a trigram model discussed in Church [1988] and trained using the "Eindhoven" tagged corpus of Dutch. It assigns syntactic category (aka part of speech) labels to words, and has some extra facilities for e.g. unknown words. I developed the tagger to use it in combination with an existing alignment program and extract information from parallel (aligned) bilingual texts. However, there don't exist many Dutch-English bilingual corpora, that I know of. Pim van der Eijk. 
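[Editorial illustration of the trigram tagging idea mentioned in the message above. This is only a minimal sketch under stated assumptions: it is a generic second-order hidden Markov model with add-one smoothing and Viterbi search, not Pim van der Eijk's tagger and not the exact formulation in Church [1988]. The toy training sentences, the tag names DET/N/V, and the function names `train` and `tag` are all hypothetical and stand in for a real tagged corpus such as the "Eindhoven" corpus.]

```python
import math
from collections import Counter

# Hypothetical toy corpus of (word, tag) pairs; a real tagger would be
# trained on a large tagged corpus instead.
TRAIN = [
    [("de", "DET"), ("hond", "N"), ("blaft", "V")],
    [("de", "DET"), ("kat", "N"), ("slaapt", "V")],
    [("een", "DET"), ("hond", "N"), ("slaapt", "V")],
]
START = "<s>"  # padding tag for the beginning of a sentence


def train(sentences):
    """Collect tag-trigram, tag-bigram-context and word/tag counts."""
    tri, bi, emit, tag_count, vocab = Counter(), Counter(), Counter(), Counter(), set()
    for sent in sentences:
        prev2, prev1 = START, START
        for word, t in sent:
            tri[(prev2, prev1, t)] += 1
            bi[(prev2, prev1)] += 1
            emit[(t, word)] += 1
            tag_count[t] += 1
            vocab.add(word)
            prev2, prev1 = prev1, t
    return tri, bi, emit, tag_count, vocab


def tag(words, model):
    """Return the most probable tag sequence under the trigram model."""
    tri, bi, emit, tag_count, vocab = model
    tags = sorted(tag_count)
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra slot for unknown words

    def log_trans(a, b, c):
        # Add-one smoothed P(c | a, b)
        return math.log((tri[(a, b, c)] + 1) / (bi[(a, b)] + n_tags))

    def log_emit(t, w):
        # Add-one smoothed P(w | t); unknown words get a small uniform mass
        return math.log((emit[(t, w)] + 1) / (tag_count[t] + n_words))

    # Viterbi over states (previous tag, current tag)
    best = {(START, START): (0.0, [])}
    for w in words:
        nxt = {}
        for (a, b), (score, path) in best.items():
            for c in tags:
                s = score + log_trans(a, b, c) + log_emit(c, w)
                if (b, c) not in nxt or s > nxt[(b, c)][0]:
                    nxt[(b, c)] = (s, path + [c])
        best = nxt
    return max(best.values(), key=lambda item: item[0])[1]


model = train(TRAIN)
print(tag(["de", "hond", "slaapt"], model))
```

[The add-one smoothing is the crudest way to give unseen words and unseen tag sequences a non-zero probability; the "extra facilities for e.g. unknown words" mentioned in the message are presumably more refined than this.]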
From feldweg@bach.sns.neuphilologie.uni-tuebingen.de Fri Oct 9 16:02:00 1992 From: feldweg@bach.sns.neuphilologie.uni-tuebingen.de (Helmut Feldweg) Subject: Re: Non-English taggers and tagged corpora To: rda@cogsci.edinburgh.ac.uk (Robert Dale) Date: Fri, 9 Oct 92 11:47:03 MET Cc: corpora@nora.hd.uib.no, rda@cogsci.edinburgh.ac.uk In-Reply-To: <634.9210090941@scott.cogsci.ed.ac.uk>; from "Robert Dale" at Oct 9, 92 10:41 am X-Mailer: ELM [version 2.3 PL11] Sounds like CHILDES is not known to everybody in this list, so I'll give a short introduction here: CHILDES (Child Language Data Exchange System) comprises (a) a set of transcript conventions tailored for child language (b) a large collection of computerized language acquisition data, most of the data formated according to (a) (c) a set of computer programs for analyzing (b) (b) and (c) are available on CD-ROM from CMU (write to CHILDES@andrew.cmu.edu or to Brian MacWhinney (brian@andrew.cmu.edu) for a copy) and through anonymous ftp (poppy.psy.cmu.edu). Although CHILDES yields free access for research purposes, one is officially required to become a 'CHILDES-member' before using the data (by sending an informal note to Brian MacWhinney). There are also some European centers of CHILDES, one of which is located at the Max-Planck for Psycholinguistics in Nijmegen. Transcript conventions, software and data collection are thoroughly described in: Brian MacWhinney: The CHILDES Project: Tools for Analyzing Talk. Hillsdale: Lawrence Erlbaums, 1991. The ISBN for the paperback is 51-0-8058-1006-4 and the price is $29.95. Not described in that manual are recent software developments. One of the developments is a morphological tagger based on the ECAT parser of Roland Hauser. It takes utterances transcribed according to (a) as input and attaches a so-called mor-tier with full morphological analysis to it. The basic engine of the parser is language independent. Language-specific information is stored in separate files. 
CHILDES currently supplies such files for English. It is said that German versions are in preparation. Ambiguous forms are 'handled' in two ways: the interactive version of the parser generates a menu of choices and asks the user to select one of them, the non-interactive version writes all possible alternatives to the output and marks the forms as ambiguous. Brian MacWhinney should be able to provide more information on this. A preliminary version of the user manual for the newer programs is available via ftp as a packed MS-Word file for the Mac at poppy.psy.cmu.edu (128.2.248.4), directory clan/macintosh, file update.sit.hqx. Helmut Feldweg (formerly coordinator for the CHILDES and ESF-databases at the MPI for Psycholinguistics, Nijmegen) Seminar f"ur allgemeine Sprachwissenschaft, Universit"at T"ubingen Wilhelmstr. 113, D-7400 T"ubingen 1, Germany email: feldweg@mailserv.zdv.uni-tuebingen.de feldweg@bach.sns.neuphilologie.uni-tuebingen.de phone: +31 (0)7071 29-4279 From louisa@essex.ac.uk Fri Oct 9 12:55:31 1992 Date: Fri, 9 Oct 1992 11:55:31 +0100 From: Sadler L To: corpora@nora.hd.uib.no Subject: RUSSIAN CORPUS RESOURCES Does anyone know of any MR corpus of Russian texts available? Thanks. Louisa Sadler Department of Language and Linguistics University of Essex Wivenhoe Park Colchester CO4 3SQ UK fax 44 206 872085 From rda@cogsci.edinburgh.ac.uk Fri Oct 9 11:41:47 1992 Via: uk.ac.edinburgh.cogsci; Fri, 9 Oct 1992 10:43:12 +0100 To: feldweg@bach.sns.neuphilologie.uni-tuebingen.de Cc: corpora@nora.hd.uib.no, rda@cogsci.edinburgh.ac.uk Subject: Re: Non-English taggers and tagged corpora In-Reply-To: Your message of "Thu, 08 Oct 92 09:55:43 BST." <199210080855.AA14328@nora.hd.uib.no> Date: Fri, 09 Oct 92 10:41:47 +0100 From: Robert Dale > The CHILDES project has a part of speech tagger for English, a similar > system for German is under development. These taggers require the data > to be formatted according to the CHILDES transcript conventions. 
Could you possibly point me to some information on the CHILDES project? I've never heard of it before. R From corplst Sun Oct 11 23:49:45 1992 Date: Sun, 11 Oct 1992 23:49:45 +0100 From: corplst (CORPORA list) To: corpora Subject: Re: Non-English taggers and tagged corpora Send-date: Sun, 11 Oct 1992 19:44:20 UTC+0100 From: Patrick John Coppock To: (CORPORA list) Message-ID: corpora:125 206*C=no;PRMD=uninett;O=unit;OU=avh;S=patCoppock Subject: Non-English taggers and tagged corpora No, you're quite right. CHILDES involves a lot of work coding (or tagging?) the texts. I wasn't in fact aware that there were programs available for doing this kind of thing automatically. They must be very "intelligent" in that case. It will be interesting to hear from any others on this topic. pat coppock (The reason I mentioned CHILDES was because they have a considerable corpus of coded texts in different languages available) pc From corplst Mon Oct 12 13:45:57 1992 Date: Mon, 12 Oct 1992 13:45:57 +0100 From: corplst (CORPORA list) To: corpora Subject: Re: Non-English taggers and tagged corpora Send-date: Mon, 12 Oct 1992 21:08:17 UTC+1000 From: LLOYD HOLLIDAY, LA TROBE UNIV, EDUCATION To: Subject: Re: Non-English taggers and tagged corpora Since we are on the topic of CHILDES, Does anyone have an ESL/EFL tagged corpus for CHILDES? Lloyd Holliday edulh@lure.latrobe.edu.au From postmaster@x400.hd.uib.no Wed Oct 14 14:56:12 1992 Date: Wed, 14 Oct 1992 13:56:12 +0100 From: NEUMAN@guvax.acc.georgetown.edu To: humbul@mail.rl.ac.uk, asis-l@uvmvm.uvm.edu, mediev-l@ukanvm.cc.ukans.edu, classics@uwavm.u.washington.edu, histownr@ubvm.bitnet, iassist@pucc.bitnet, corpora@x400.hd.uib.no, gbloomq@acadvm1.uottawa.ca Subject: Please post 2nd call. X-Vms-To: @NETWORK.LIS;10 X-Vms-Cc: NEUMAN Please consider posting this reminder of the upcoming deadline for submissions to ACH-ALLC93. Thank you. M.N. ...................................................................... 
Dear Colleagues, November 1st, the deadline for submitting proposals for ACH-ALLC93, is fast approaching. We welcome your inquiries and your submissions. For more details, see the call for papers below. Regards, Michael Neuman Georgetown University for ACH-ALLC93 ......................................................................... ASSOCIATION FOR COMPUTERS AND THE HUMANITIES ASSOCIATION FOR LITERARY AND LINGUISTIC COMPUTING 1993 JOINT INTERNATIONAL CONFERENCE ACH-ALLC93 JUNE 16-19, 1993 GEORGETOWN UNIVERSITY, WASHINGTON, D.C. CALL FOR PAPERS This conference is the major forum for literary, linguistic and humanities computing. It is concerned with the development of new computing methodologies for research and teaching in the humanities, the development of significant new networked-based and computer-based resources for humanities research, and the application and evaluation of computing techniques in humanities subjects. TOPICS: We welcome submissions on topics such as text encoding; statistical methods for text analysis; hypertext; text corpora; computational lexicography; morphological, syntactic, semantic and other forms of text analysis; also, computer applications in history, philosophy, music and other humanities disciplines. For the 1993 conference, ACH and ALLC extend a special invitation to members of the library community to contribute to the conference on the topics of creating and cataloguing network-based resources in the humanities, developing and integrating databases of texts and images of works central to the humanities, and refining retrieval techniques for humanities databases. LOCATION: Georgetown, an historic residential district along the Potomac River, is a six-mile ride by taxi from Washington National Airport. International flights arrive at Dulles Airport, which offers regular bus service to the Nation's Capital. REQUIREMENTS: Proposals should describe substantial and original work. 
Proposals describing the development of new computing methodologies should make clear how these methodologies are applied to research and/or teaching in the humanities. Those concerned with a particular application (e.g., a study of the style of an author) should cite previous approaches to the problem and should include some critical assessment of the computing methodologies used. All proposals should include references to important sources. ABSTRACT LENGTH: Abstracts of 1500-2000 words in length should be submitted for presentations of thirty minutes including questions. SESSION PROPOSALS: Proposals for sessions (90 minutes) are also invited. These should take the form of either: (a) Three papers. The session organizer should submit a 500-word statement describing the session topic, include abstracts of 1000-1500 words for each paper, and indicate that each author is willing to participate in the session. (b) A panel of up to 6 speakers. The panel organizer should submit an abstract of 1500-2000 words describing the panel topic, how it will be organized, the names of all the speakers, and an indication that each speaker is willing to participate in the session. DEADLINE FOR SUBMISSIONS: November 1, 1992 NOTIFICATION OF ACCEPTANCE: February 1, 1993 FORMAT FOR SUBMISSIONS: Electronic submissions are strongly encouraged, and should follow strictly the format given below. Submissions that do not conform to this format will be returned to the authors for reformatting, or may not be considered if they arrive near the deadline. All submissions should include a header in the following format: TITLE: title of paper AUTHOR(S): names of authors AFFILIATION: affiliations of author(s) CONTACT ADDRESS: full postal address of main author (for contact) E-MAIL: electronic mail address of main author followed by other authors (if any) FAX NUMBER: fax for main author PHONE NUMBER: phone for main author ELECTRONIC SUBMISSIONS: Please submit plain ASCII text files. 
Files that include formatting by a wordprocessor, TAB characters, and soft hyphens are not acceptable. Paragraphs should be separated by blank lines. Headings and subheadings should be on separate lines and be numbered. References (up to six) and notes should appear at the end of the abstract. Where necessary, a simple markup scheme for accents and other characters that cannot be transmitted by electronic mail should be used; provide an explanation of the markup scheme after the title information. If diagrams are necessary for the evaluation of an electronic submission, they should be faxed to 1-202-687-6003 (after dialing one's international access code) or 202-687-6003 (from within the US), and a note to indicate the presence of diagrams should be inserted at the beginning of the abstract. Address for electronic submissions: Neuman@GUVAX.Georgetown.edu (include a subject line " Submission for ACH-ALLC93"). PAPER SUBMISSIONS: Submissions should be typed or printed on one side of the paper only, with ample margins. Six copies should be sent to ACH-ALLC93 (Paper submission) Dr. Michael Neuman Academic Computer Center 238 Reiss Science Building Georgetown University Washington, D.C. 20057 PUBLICATION: A selection of papers presented at the conference will be published in the series Research in Humanities Computing edited by Susan Hockey and Nancy Ide, published by Oxford University Press. INTERNATIONAL PROGRAM COMMITTEE Chair: Marianne Gaunt, Rutgers University (ACH) Thomas Corns, University of Wales, Bangor (ALLC) Paul Fortier, University of Manitoba (ACH) Jacqueline Hamesse, Universite Catholique Louvain-la-Neuve (ALLC) Susan Hockey, Rutgers and Princeton Universities (ALLC) Nancy Ide, Vassar College (ACH) Randall Jones, Brigham Young University (ACH) Michael Neuman, Georgetown University (ACH) (Local organizer) Antonio Zampolli, University of Pisa (ALLC) INQUIRIES Please address all inquiries to: ACH-ALLC93 Dr. 
Michael Neuman, Local Organizer Academic Computer Center 238 Reiss Science Building Georgetown University Washington, D.C. 20057 Phone: 202-687-6096 FAX: 202-687-6003 Bitnet: Neuman@Guvax Internet: Neuman@Guvax.Georgetown.edu Please include your name, full mailing address, telephone and fax numbers, and e-mail address with any inquiry. From postmaster@x400.hd.uib.no Sun Oct 18 13:08:59 1992 Date: Sun, 18 Oct 1992 12:08:59 +0100 From: " (CORPORA list)" To: corpora@nora.hd.uib.no Subject: Portuguese Send-date: Fri, 16 Oct 1992 19:23:46 UTC+0100 From: To: Subject: Portuguese Can anyone help me find corpora for Brazilian and Continental Portuguese? Thanks, Ken Beesley beesley.parc@xerox.com From postmaster@x400.hd.uib.no Sun Oct 18 13:09:17 1992 Date: Sun, 18 Oct 1992 12:09:17 +0100 From: " (CORPORA list)" To: corpora@nora.hd.uib.no Subject: Unicode Implementors Workshop, Sulzbach (Taunus) Germany Send-date: Fri, 16 Oct 1992 21:37:38 UTC+0100 From: To: , , , Subject: Unicode Implementors Workshop, Sulzbach (Taunus) Germany UNICODE / ISO 10646 IMPLEMENTERS WORKSHOP #4 December 3 & 4 1992 Sulzbach (Taunus) Germany This is an announcement for the upcoming fourth workshop in a series of workshops on implementing the Unicode Standard and sponsored by the Unicode Consortium. This is the first time the workshop is being offered in Europe. Should you not be able to attend yourself, please pass this notice on to somebody else who might be interested in attending this workshop. Unicode: Unicode is an international character encoding standard that encompasses all the world's national scripts in a 16bit code space. Unicode is a profile of the International Standard ISO 10646. Supported by most major computer and software vendors, Unicode greatly facilitates the development of internationally accepted software. All currently used character sets can be transferred to Unicode. 
Target Audience: Software developers, technical writers, managers or engineers developing or considering development of software for the international market. Date and Time: December 3 and 4, 1992, 9am - 5pm Location: The workshop will be held at the Holiday Inn in the German town of Sulzbach (Taunus). Agenda: December 3 will feature a full day, professionally developed, lecture course covering the Goals and Architecture of the Unicode Standard. It will address the problems posed by support for writing systems worldwide and how Unicode's design enables their solution. In addition, the course will explore several implementation strategies for Unicode and illustrate them with specific examples. December 4 will feature invited papers on implementation aspects of Unicode and its relation to other standards. These seminar-style talks will cover specific, practical problems and point out solutions, as well as provide case studies and demonstrations of Unicode implementations underway. Speakers: Introductory Course Glenn Adams, Metis Technology, Inc. & The Institute for Advanced Professional Studies Keynote Address Götz H. Siebrecht General Manager Unisys Deutschland GmbH Unicode and 10646 Isai Scheinberg IBM Corporation-Canada Program Migration to Unicode Alan Barrett Lotus Development Ireland Non-Spacing Marks Mark Davis Taligent, Inc. Operating System Support (Windows N/T) Michel Suignard Microsoft Europe Codeset Conversions Lloyd Honomichl Novell, Inc. Unicode In the XPG/Posix Model Gary Miller IBM Corporation Collating Unicode Data Alain La Bonté Ministry of Communications, Quebec Unicode Support in the Application Development Tool Kit Tuoc Vinh Luong Borland International Internationalization in Windows Past and Future Bill Hall Novell, Inc. Unicode and Print Servers Tadao Yamasaki IBM Corporation/Pennant Unicode/UCS: What does it mean for the European environment? 
Jürgen Bettels Digital Switzerland Practical Experience with Unicode Bidi Algorithms Alex Morcos Microsoft Corporation Unicode and Network Internationalization Wayne Taylor Novell, Inc. Proceedings: A complete set of notes will be provided for the course at no extra charge. Hotel: The workshop will be held at the Holiday Inn, Sulzbach (Taunus), Germany. The conference rate for a single room is DM 165 per night which includes breakfast. Lunch and dinner on December 3, and lunch on December 4 are included in the conference registration fee. Room arrangements should be made directly with the hotel, requesting the Unicode Workshop rate. Reservations +49-6196-763810 Fax +49-6196-72996 Registration: To register, please complete the registration form to the right and send together with a check, or credit card information, to either of the locations below. Cancellations must be received by November 27, 1992 and will carry a DM 50 (US$37) cancellation fee. Contacts: European Contact: Unicode Implementer's Workshop c/o Unisys Deutschland GmbH Frau Helga Mifka/ML Postfach 1110 D-6231 Sulzbach (Taunus) Germany Phone: +49-6196-991259 Fax: +49-6196-991860 U.S./Canada Contact: Unicode Implementer's Workshop Classic Consulting, International 2249 LeClair Drive Coquitlam, British Columbia V3K 6P6 Canada Phone: 604-931-7600 Fax: 604-937-5898 E-mail: 72630.107@compuserve.com Registration form: Name: ________________________________________ Company: _____________________________________ Address: _____________________________________ City: ________________________________________ Country: _____________________________________ Postal Code: _________________________________ Phone: _______________________________________ Fax: _________________________________________ Non Member: DM 550______ US$370______ Unicode Member: DM 450______ US$300______ __ Check Enclosed __ AMEX __ Visa __ MC Card#: ________________________ Exp Date:_____ Signature: ___________________________________ Make checks 
payable to Unicode, Inc. Employees of Unicode Full and Associate Members are eligible for the member discount. Cancellations must be received by November 27, 1992 and will carry a cancellation charge of DM 50 (US$37). From postmaster@x400.hd.uib.no Sun Oct 18 13:09:34 1992 Date: Sun, 18 Oct 1992 12:09:34 +0100 From: " (CORPORA list)" To: corpora@nora.hd.uib.no Subject: EFL corpora & tagging Send-date: Fri, 16 Oct 1992 8:18:41 UTC+0100 From: lcjohn To: Subject: EFL corpora & tagging RFC-822-HEADERS: X-Organization: The Hong Kong University of Science & Technology (HKUST) I've just joined this list, so pardon if I'm stating the obvious: there have been a couple of questions about EFL corpora and tagging. Here at HKUST we are compiling a corpus of the written English of Chinese learners; it's still early days, but we hope, if funding comes through, to have 5 million words within a couple of years. The only other project that I know of which is bent on gathering learner English is at the University of Louvain in Belgium. Theirs is, I believe, mainly a European focus. The two taggers for interlanguage analysis I've heard of, both still under development, but due out shortly, are Nijmegen's and one from the University of Sydney called COALA. I'd appreciate hearing of any other work, either on interlanguage corpora collection, or tagging/parsing of said. 
** John Milton Language Centre E.mail: lcjohn@usthk.bitnet The Hong Kong University lcjohn@usthk.ust.hk (Internet) of Science and Technology Fax: (852)335 0249 Clearwater Bay Tel: (852)358 7849 Kowloon, HONG KONG From postmaster@x400.hd.uib.no Mon Oct 19 12:28:10 1992 Date: Mon, 19 Oct 1992 11:28:10 +0100 From: " (CORPORA list)" To: corpora@nora.hd.uib.no Subject: RE: Portuguese Send-date: Mon, 19 Oct 1992 8:34:31 UTC+0100 From: To: Subject: RE: Portuguese >To: corpora@nora.hd.uib.no >Subj: Portuguese > >Send-date: Fri, 16 Oct 1992 19:23:46 UTC+0100 >From: >To: >Subject: Portuguese > >Can anyone help me find corpora for Brazilian and Continental Portuguese? > In the context of NERC (Network of European Reference Corpora), the Centro de Linguistica da Universidade de Lisboa is collaborating in an effort to provide corpora for European languages, being responsible for the Portuguese part. The contact for NERC is Antonia Zampolli, glottolo@icnucvm.cnuce.cnr.it > >Thanks, >Ken Beesley >beesley.parc@xerox.com From postmaster@x400.hd.uib.no Tue Oct 20 01:29:56 1992 Date: Tue, 20 Oct 1992 00:29:56 +0100 From: " (CORPORA list)" To: corpora@x400.hd.uib.no Subject: Re: [EFL corpora & tagging] Send-date: Mon, 19 Oct 1992 11:44:49 UTC+0100 From: PSP10 To: (CORPORA list) Subject: Re: [EFL corpora & tagging] We at the Cambridge Language Survey are interested in any corpora in English composed of non-native-speaker generated material. We are also compiling our own and will be happy to make these available free of charge. Please keep me in touch. Best wishes, Paul (Procter) From postmaster@x400.hd.uib.no Tue Oct 20 01:30:14 1992 Date: Tue, 20 Oct 1992 00:30:14 +0100 From: " (CORPORA list)" To: corpora@x400.hd.uib.no Subject: RE: Portuguese Send-date: Mon, 19 Oct 1992 11:54:43 UTC+0100 From: To: Subject: RE: Portuguese Please correct my typo, the name of the contact for NERC is Antonio Zampolli. 
From postmaster@x400.hd.uib.no Tue Oct 20 01:30:32 1992 Date: Tue, 20 Oct 1992 00:30:32 +0100 From: " (CORPORA list)" To: corpora@x400.hd.uib.no Subject: Re: EFL corpora & tagging Send-date: Mon, 19 Oct 1992 6:55:10 UTC-0700 From: To: Subject: Re: EFL corpora & tagging A concordancer which tags and which is set up for several European languages is Letteratura Amica. Developer/author: Raffaele Cocchi COCCHI@ASTBO1.BO.CNR.IT From postmaster@x400.hd.uib.no Tue Oct 20 01:37:23 1992 Date: Tue, 20 Oct 1992 00:37:23 +0100 From: " (CORPORA list)" To: corpora@x400.hd.uib.no Subject: Re: Portuguese Delivery-date: Mon, 19 Oct 1992 12:45:50 UTC+0100 Send-date: Fri, 4 Jan 1980 9:08:56 UTC From: Geraldo Lino de Campos To: (CORPORA list) Subject: Re: Portuguese I don't know of any corpora of Brazilian texts, but I am preparing a file with a transcription of the Vocabulario Ortografico, which is the official list of Brazilian Portuguese words (322.000 + entries). It should be finished by year end. Geraldo Lino de Campos Escola Politecnica da Universidade de Sao Paulo -- Geraldo Lino de Campos glcampos@pec001.usp.ansp.br Phone 55-11-815-9322 ext 3288 Departamento de Engenharia de FAX 55-11-211-4308 Computacao e Sistemas Digitais Escola Politecnica da Universidade de Sao Paulo From postmaster@x400.hd.uib.no Tue Oct 20 15:04:26 1992 Date: Tue, 20 Oct 1992 14:04:26 +0100 From: " (CORPORA list)" To: corpora@x400.hd.uib.no Subject: Re: Portuguese Send-date: Tue, 20 Oct 1992 11:25:00 UTC From: To: Reply-To: Subject: Re: Portuguese RFC-822-HEADERS: X-Organization: The Hong Kong University of Science & Technology (HKUST) I have tried the e-mail address given for Antonia Zampolli, but it doesn't work. Could the person who gave it to us please check to see whether it is right. Many thanks. 
[ The address of Zampolli is: glottolo@icnucevm.cnuce.cnr.it -Knut ] ^ From postmaster@x400.hd.uib.no Tue Oct 20 15:04:46 1992 Date: Tue, 20 Oct 1992 14:04:46 +0100 From: " (CORPORA list)" To: corpora@x400.hd.uib.no Subject: Re: [EFL corpora & tagging] Send-date: Tue, 20 Oct 1992 8:44:00 UTC+0100 From: To: Reply-To: Subject: Re: [EFL corpora & tagging] Dear Paul, An International Corpus of Learner English (ICLE) is in progress under the direction of Sylviane Granger at the Catholic University of Louvain, Belgium. It will consist of compositions by advanced learners of English from Belgium, Germany, Holland, Sweden and China (so far). The material will form a specialized corpus within the ICE project directed by Sidney Greenbaum. Sylviane Granger's e-mail address is: GRANGER@ETAN.UCL.ac.be Best wishes, Bengt Altenberg From postmaster@x400.hd.uib.no Tue Oct 20 15:05:10 1992 Date: Tue, 20 Oct 1992 14:05:10 +0100 From: " (CORPORA list)" To: corpora@x400.hd.uib.no Subject: Re: EFL corpora & tagging Send-date: Tue, 20 Oct 1992 10:23:39 UTC+0100 From: Patrick John Coppock To: (CORPORA list) Reply-To: Subject: Re: EFL corpora & tagging Could we get some more information on Letteratura Amica? Things like what it tags, what type of machines it works on (eg IBM PC, Mac, Unix etc), price? Anyone tried using it? To do what? How did it work? pat coppock dept. of applied linguistics university of trondheim avh n-7055 dragvoll norway From corpora-request@uib.no Thu Oct 22 21:47:43 1992 Date: Thu, 22 Oct 1992 20:47:43 +0100 From: knut@nora.hd.uib.no (Knut Hofland) To: corpora@nora.hd.uib.no Subject: Adm. message I apologize for the duplication of the last messages to the list. This was due to retransmissions between two mailers at our end. I have now (again) changed to a new and more robust system for redistribution of messages. Please use the address corpora@nora.hd.uib.no when sending messages to the list. 
Mail to the other addresses that have been used, will be redirected to this address. Knut Hofland From corpora-request@uib.no Thu Oct 22 23:25:02 1992 Date: Thu, 22 Oct 1992 22:25:02 +0100 From: corplst@nora.hd.uib.no (CORPORA list) To: corpora@nora.hd.uib.no Subject: Re: Portuguese ********************* Text Corpora List *************************** CORPORA@NORA.HD.UIB.NO for messages to the list CORPORA-REQUEST@NORA.HD.UIB.NO for messages to list administrator ******************************************************************* Send-date: Tue, 20 Oct 1992 17:01:57 UTC+0100 From: To: Subject: Re: Portuguese >To: corpora@x400.hd.uib.no >Subj: Re: Portuguese > >Send-date: Tue, 20 Oct 1992 11:25:00 UTC >From: >To: >Reply-To: >Subject: Re: Portuguese > >RFC-822-HEADERS: >X-Organization: The Hong Kong University of Science & Technology (HKUST) > >I have tried the e-mail address given for Antonia Zampolli, but it doesn't >work. Could the person who gave it to us please check to see whether it is >right. >Many thanks. > >[ The address of Zampolli is: glottolo@icnucevm.cnuce.cnr.it >-Knut ] > Another address is "glottolo@icnucevm.bitnet" From corpora-request@uib.no Fri Oct 23 00:58:37 1992 Date: Thu, 22 Oct 1992 23:58:37 +0100 From: corplst@nora.hd.uib.no (CORPORA list) To: corpora@nora.hd.uib.no Subject: Re: EFL corpora & tagging ********************* Text Corpora List *************************** CORPORA@NORA.HD.UIB.NO for messages to the list CORPORA-REQUEST@NORA.HD.UIB.NO for messages to list administrator ******************************************************************* Send-date: Tue, 20 Oct 1992 14:43:47 UTC-0700 From: To: Subject: Re: EFL corpora & tagging Letteratura Amica (Literary Amiga) is an Amiga program, under development for MS-DOS (Windows?). Better you e-mail Cocchi to discuss its many features because I am not "into" tagging. I can say that it works fine, does wonders (I wish I had a facility for class use with it!), speaks, and is priced right. 
Macey Taylor maceytay@ccit.arizona.edu (U of AZ) From corpora-request@uib.no Fri Oct 23 01:40:52 1992 Date: Fri, 23 Oct 1992 00:40:52 +0100 From: corplst@nora.hd.uib.no (CORPORA list) To: corpora@nora.hd.uib.no Subject: Bulgarian corpus ********************* Text Corpora List *************************** CORPORA@NORA.HD.UIB.NO for messages to the list CORPORA-REQUEST@NORA.HD.UIB.NO for messages to list administrator ******************************************************************* Send-date: Wed, 21 Oct 1992 2:33:28 UTC+0100 From: Brian Linson To: Subject: Bulgarian corpus Is there that anyone knows a Bulgarian corpus? blinson@linc.cis.upenn.edu Brian Linson From corpora-request@uib.no Fri Oct 23 11:17:14 1992 Date: Fri, 23 Oct 1992 10:17:14 +0100 From: corplst@nora.hd.uib.no (CORPORA list) To: corpora@nora.hd.uib.no Subject: Re: Bulgarian corpus ********************* Text Corpora List *************************** CORPORA@NORA.HD.UIB.NO for messages to the list CORPORA-REQUEST@NORA.HD.UIB.NO for messages to list administrator ******************************************************************* Send-date: Fri, 23 Oct 1992 9:25:00 UTC+0100 From: Anna Sågvall Hein To: corplst (CORPORA list) Subject: Re: Bulgarian corpus Yes, there is a corpus of Bulgarian poetry at Uppsala University (Dept. of Slavonic Languages and Dept. of Linguistics). Interested? 

From corpora-request@uib.no Fri Oct 23 14:12:38 1992
Date: Fri, 23 Oct 1992 13:12:38 +0100
From: corplst@nora.hd.uib.no (CORPORA list)
To: corpora@nora.hd.uib.no
Subject: Re: Bulgarian corpus

Send-date: Fri, 23 Oct 1992 12:13:24 UTC+0100
From:
To: (CORPORA list)
Subject: Re: Bulgarian corpus

>Send-date: Wed, 21 Oct 1992 2:33:28 UTC+0100
>From: Brian Linson
>To:
>Subject: Bulgarian corpus
>
>Does anyone know of a Bulgarian corpus?
>
>blinson@linc.cis.upenn.edu
>Brian Linson

There is a small one at the Oxford Text Archive.
[ email: archive@vax.ox.ac.uk -Ed]

-Kjetil Ra Hauge, U. of Oslo, P.O. Box 1030 Blindern, N-0315 Oslo, Norway
-E-mail: K.R.Hauge@easteur-orient.uio.no
-Fax: +472-854140
-Phone: +472-856710

From corpora-request@uib.no Fri Oct 23 15:42:02 1992
Date: Fri, 23 Oct 1992 14:42:02 +0100
From: "Henry S. Thompson"
To: CORPORA@x400.hd.uib.no
Subject: Corpora

Does anyone have information about machine-readable text corpora in the following languages (preferably newspapers or novels)?

Albanian
Basque
Finnish
Icelandic
Latvian
Lithuanian
Polish
Romanian

eucorp@cogsci.ed.ac.uk
HCRC
Edinburgh University
David McKelvie

From corpora-request@uib.no Fri Oct 23 17:16:58 1992
Date: Fri, 23 Oct 1992 16:16:58 +0100
From: Joseph Raben
To: corpora@x400.hd.uib.no
Subject: Zampolli's e-mail address

In the last few days I have been exchanging e-mail with Zampolli at .

Joseph Raben, City University of New York

From corpora-request@uib.no Fri Oct 23 08:43:52 1992
Date: Fri, 23 Oct 1992 17:44:55 +0100
From: N o s t a l g i a
To: CORPORA@x400.hd.uib.no
Cc: matsuda@linc.cis.upenn.edu
Subject: Japanese?
Posted-Date: Fri, 23 Oct 92 12:43:52 -0400

Hi,

Are there any on-line Japanese corpora in any format (e.g. katakana, romaji, etc.)?
I only know of one, in Japan, in a tape format written in katakana. Any info would be greatly appreciated.

Kenjiro Matsuda
Dept. of Linguistics
Univ. of Pennsylvania
matsuda@linc.cis.upenn.edu

From corpora-request@uib.no Fri Oct 23 09:30:22 1992
From: bro@elm.circa.ufl.edu (John Bro)
Subject: Old French?
To: corpora@nora.hd.uib.no (Corpora List)
Date: Fri, 23 Oct 92 13:30:22 EDT
Reply-To: bro@elm.circa.ufl.edu
X-Mailer: ELM [version 2.3 PL11]

In the flurry of requests for corpora, I guess I'll add another of mine :)

Is there any Old French out there?

============================================================
John Bro               | bro@elm.circa.ufl.edu
Linguistics            | bougie@pine.circa.ufl.edu
University of Florida  | bougie@ufpine.bitnet
Gainesville, Fl 32611  | bro@reef.cis.ufl.edu

From corpora-request@uib.no Fri Oct 23 20:47:50 1992
Date: Fri, 23 Oct 1992 19:47:50 +0100
From: parkinson
To: CORPORA@nora.hd.uib.no
Cc: parkinson
Subject: Portuguese

I am assembling an archive of e-texts of Modern Portuguese fiction in conjunction with the Portuguese NERC group already mentioned. I also have some 15th and 16th century historical works nearly ready for release.

I keep hearing reports of a Brazilian corpus set up by M.T. Biderman at Araraquara. Does anyone know more of this?

Stephen Parkinson
University of Oxford

From corpora-request@uib.no Tue Oct 23 09:58:14 1992
Date: 23 Oct 1992 14:58:14 -0500
From: PVROCKWELL@amherst.edu
Subject: corpora & copyright
To: CORPORA@nora.hd.uib.no

Dear friends,

Simply for the information of those who are preparing electronic texts, I would like to pass on a legal opinion that I solicited when I was approached by scholars interested in using an electronic text database that I had prepared for my own research. Current American copyright law restricts the distribution of electronic texts. I believe that this is the case for international copyright law, but I am not sure.
I was informed that recent decisions within the judicial system have made it unlawful to distribute electronic texts that have been scanned from copyrighted material. Current legal opinion holds that an electronic text is analogous to a photocopied text. Researchers do have the right to make one copy of a text that is copyrighted, but for the purposes of their own research only. Distribution of that text, even if no exchange of money is involved, is considered a breach of copyright law. Current penalties include, in the worst scenario, the impounding and/or destruction of the machines used to reproduce the copyrighted text.

In short, those who produce electronic texts from copyrighted material for the purposes of their own research are, for the time being, in the clear. Those who would reproduce that material and share or distribute it to others in any way are incurring a certain risk. It is not clear that a publisher would have any reason to go after a researcher who wanted to share materials with his or her colleagues. However, that researcher would be left open to both criminal and civil liabilities of various sorts, if indeed the person holding the copyright would care to pursue them.

If all this is common knowledge, please excuse me for wasting precious time.

Best,
pvr

From corpora-request@uib.no Fri Oct 23 07:15:54 1992
        id <23367-0@alf.uib.no>; Fri, 23 Oct 1992 22:16:05 +0100
        Fri, 23 Oct 92 14:15:54 -0700
Date: Fri, 23 Oct 92 14:15:54 -0700
From: edwards@cogsci.Berkeley.EDU (Jane Edwards)
To: CORPORA@nora.hd.uib.no
Subject: NERC
Cc: edwards@cogsci.Berkeley.EDU

It seems I keep seeing mention of NERC, but have only a very sketchy sense of what it is. I wonder if someone associated with NERC could perhaps provide us with information similar to what we received from the European Corpus Initiative project - i.e., languages covered, nature of the data, some information concerning constraints on access or costs, and when the data are expected to become available?
It's obviously an important project, and I for one haven't yet heard even these basic aspects of it. I would like very much to hear more about it.

-Jane Edwards (edwards@cogsci.berkeley.edu)

From corpora-request@uib.no Fri Oct 23 13:01:00 1992
        id <26376-0@alf.uib.no>; Sat, 24 Oct 1992 00:06:18 +0100
        id 4592; Fri, 23 Oct 92 18:04:48 CDT
Date: Fri, 23 Oct 92 18:01 CDT
From: Bob Clark
Subject: provencal/occitan/catalan corpora
To: CORPORA

Does anyone know of the existence of corpora for Provencal/Occitan or for Catalan? I'm interested in both the old and the modern language. Thanks.

Bob Clark
Kansas State Univ.

From corpora-request@uib.no Thu Oct 25 03:16:34 1992
        id <05289-0@alf.uib.no>; Sun, 25 Oct 1992 17:17:01 +0100
        id <01GQCZ92DXIE8Y5I3P@CC.USU.EDU>; Sun, 25 Oct 1992 09:16:34 MDT
Date: 25 Oct 1992 09:16:34 -0600 (MDT)
From: rebecca wheeler
Subject: American English corpera?
To: corpora@nora.hd.uib.no
X-Vms-To: IN%"corpora@nora.hd.uib.no"
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
Content-Transfer-Encoding: 7BIT

I'm looking for American English corpora of relatively current sources such as newspapers or novels. I've got a Mac IIsi and a CD-reader. It, as usual, needs to be economical since I'm an independent scholar, without University support.

I'm aware of Wordcruncher CD (which has novels by Faulkner, Cather, London etc.) and of FrontPageNews, which has annual international wire services text. The former is a bit older than I want to be dealing with and the latter has a very cumbersome search routine.

My purpose is lexical semantic research -- I look at the relationship between semantics, syntax and pragmatics and want to be able to search for a given lexical item, and pull up its syntactic and discourse context. The corpus doesn't have to be tagged -- I am willing to do that myself at this point.

Oh, of course, I'm also aware of the Brown and the LUNDES (sp?) corpus but again am looking for more current corpora.
Anyone with info on such contemporary American English corpora (newspapers, novels), let me know.

thanks!

rebecca wheeler
logan, utah

From corpora-request@uib.no Tue Oct 27 00:50:04 1992
        id <11471-0@alf.uib.no>; Mon, 26 Oct 1992 23:48:41 +0100
Date: Mon, 26 Oct 1992 23:50:04 +0100
From: corplst@nora.hd.uib.no (CORPORA list)
To: corpora@nora.hd.uib.no
Subject: Re: American English corpera?

From: andras
Subject: Re: American English corpera?
To: corpora@nora.hd.uib.no
Date: Mon, 26 Oct 92 10:45:57 PST

> I'm looking for American English corpora of relatively current
> sources such as newspapers or novels. I've got a Mac IIsi and a CD-reader.
> It, as usual, needs to be economical since I'm an independent scholar,
> Anyone with info on such contemporary American English corpora
> (newspapers, novels), let me know.
> rebecca wheeler
> logan, utah

The Association for Computational Linguistics Data Collection Initiative (ACL-DCI) has already put out a CD-ROM that contains (among other things) the full 1987 and 1988 volumes of the Wall Street Journal and part of 1989. I think that you have to be an ACL member to get it (membership fee approx $40/year, for which you receive the journal "Computational Linguistics" that does discuss corpora-related issues now and again) and you have to pay a largely symbolic fee (<$100).

The drawback is that the CD-ROM is in Unix (High Sierra) file format -- however, I understand there exists a relatively cheap version of Unix for the Mac (sorry I don't have the details).

Andras Kornai

From corpora-request@uib.no Mon Oct 26 09:31:58 1992
        id <17659-0@alf.uib.no>; Tue, 27 Oct 1992 00:32:35 +0100
        Mon, 26 Oct 92 16:31:58 MST
Date: Mon, 26 Oct 92 16:31:58 MST
From: ted
To: corplst@nora.hd.uib.no
Cc: corpora@nora.hd.uib.no
In-Reply-To: CORPORA list's message of Mon, 26 Oct 1992 23:50:04 +0100 <199210262250.AA02357@nora.hd.uib.no>
Subject: American English corpera?
Reply-To: ted@nmsu.edu

... comments about the ldc/dci cdrom deleted ...

   The drawback is that the CD-ROM is in Unix (High Sierra) file
   format -- however, I understand there exists a relatively cheap
   version of Unix for the Mac (sorry I don't have the details).

in fact, the high sierra format is more of a pc format. in any case, it reflects the msdos limitations on file name length and so on. there are drivers for unix machines like the suns which allow access to these files, but that doesn't make the format a unix format.

there should be drivers available for the mac which allow trivial access to the acl/dci disks.

From corpora-request@uib.no Tue Oct 27 03:45:58 1992
        id <24244-0@alf.uib.no>; Tue, 27 Oct 1992 02:44:33 +0100
Date: Tue, 27 Oct 1992 02:45:58 +0100
From: knut@nora.hd.uib.no (Knut Hofland)
To: corpora@nora.hd.uib.no
Subject: ACL/DCI CD-ROM

The CD-ROM can be read on any PC, Mac or Unix machine. But the texts are coded with the Unix newline character (LF). The Mac uses CR and MS-DOS uses CR+LF. So on a Macintosh one will either have to use a program that understands the Unix newline or use a text conversion utility that translates LF to CR. On a PC the conversion has to be done from LF to CR+LF.

On the ICAME CD-ROM (containing the Brown, LOB, London-Lund, Kolhapur and Helsinki corpora), we "solved" this problem by duplicating the data in 3 directories, one for PC, one for Mac and one for Unix.

Knut Hofland
Norwegian Computing Centre for the Humanities, Harald Haarfagres gt.
31, N-5007 Bergen, Norway
Phone +47 5 212954/5/6 Fax: +47 5 322656
E-mail: knut@x400.hd.uib.no

From corpora-request@uib.no Tue Oct 27 04:12:59 1992
        id <25574-0@alf.uib.no>; Tue, 27 Oct 1992 04:10:45 +0100
        via SMTP Mon, 26 Oct 92 19:10:33 -0800 for CORPORA@nora.hd.uib.no
        id AA18962 to CORPORA@nora.hd.uib.no; Mon, 26 Oct 92 19:10:29 PPE
Date: Mon, 26 Oct 92 19:10:29 PPE
From: lansing@bend.UCSD.EDU (Jeff Lansing)
To: CORPORA@nora.hd.uib.no

HELP

From corpora-request@uib.no Tue Oct 27 14:07:56 1992
Date: Tue, 27 Oct 1992 13:07:56 +0100
Priority: Non-Urgent
Dl-Expansion-History: corpora@uib.no ; Tue, 27 Oct 1992 13:07:56 +0100;
From: LINTHS@stud.hum.aau.dk
To: corpora@x400.hd.uib.no
X-Mailer: Pegasus Mail v2.2 (R4).

index mac
index corpora

From corpora-request@uib.no Tue Oct 27 10:21:32 1992
        id <28204-0@alf.uib.no>; Wed, 28 Oct 1992 01:28:07 +0100
        by acs1.acs.ucalgary.ca (AIX 3.2/UCB 5.64/4.03) id AA43929; Tue, 27 Oct 1992 18:23:16 -0600
        Tue, 27 Oct 1992 17:21:33 -0700
From: southerl@acs.ucalgary.ca
Subject: Interethnic Conversations
To: corpora@nora.hd.uib.no
Date: Tue, 27 Oct 92 17:21:32 MST
X-Mailer: ELM [version 2.3 PL11s]

I'm writing on behalf of a colleague who does not have access to CORPORA. He requests information on any corpora (preferably not too large) containing interethnic conversations in which there is a breakdown of communication (leading to argument). I'll pass along any info to him. Thanks.

--

From corpora-request@uib.no Wed Oct 28 14:42:27 1992
        id <01919-0@alf.uib.no>; Wed, 28 Oct 1992 13:43:31 +0100
Date: Wed, 28 Oct 1992 13:42:27 +0100
From: PSP10
To: corpora@nora.hd.uib.no
Subject: Cambridge Language Survey

The Cambridge Language Survey (CLS) is an international, multilingual survey of language coordinated by Cambridge University, UK. Its main activities are the development of sense-tagged corpora within an integrated language database (corpora indexed to their meanings in electronic dictionaries).
The materials will eventually be made available to the scholarly community, including corpus materials and software tools for NLP.

Would anyone wishing to be kept in touch with these developments who has not yet been in contact with us please drop a note to: Christina Hottner, Cambridge Language Survey, Shaftesbury Road, Cambridge CB2 2RU, UK, with a request to be sent a description of the survey.

Paul Procter

From corpora-request@uib.no Wed Oct 28 17:28:04 1992
        id <09051-0@alf.uib.no>; Fri, 30 Oct 1992 11:33:04 +0100
        id <09045-0@alf.uib.no>; Fri, 30 Oct 1992 11:32:57 +0100
        id 6509; Fri, 30 Oct 92 11:32:41 EMT
        id 6508; Fri, 30 Oct 92 11:32:40 EMT
        id 5460; Wed, 28 Oct 92 15:26:58 TUR
        id AA04116; Wed, 28 Oct 92 15:28:04 +0200
Date: Wed, 28 Oct 92 15:28:04 +0200
From: ko Wed, 28 Oct 92 15:28:12 +0200
To: corpora@nora.hd.uib.no
Subject: Tagger Information

Could someone point me to any work that discusses tagger functionality for tagging texts in agglutinative languages such as Finnish or Turkish? Pointers to general tagger information would also be appreciated.

Thanks in advance

Kemal Oflazer
Bilkent University
Computer Engineering Department
Bilkent, ANKARA, 06533 TURKIYE
e-mail: ko@trbilun.bitnet
fax: (90) 4 - 266-4127
tel: (90) 4 - 266-4133

From corpora-request@uib.no Fri Oct 30 06:25:51 1992
Date: Fri, 30 Oct 1992 12:25:51 -0600
From: kambou moses
To: corpora@nora.hd.uib.no
Subject: Enqiry

I'd like to receive information concerning bibliographies on language policy in a multilingual society. I intend to research language policy in view of economic integration. The countries concerned have French, English and Portuguese as official languages.

Thanks in advance.

Moses Kambou

From corpora-request@uib.no Fri Oct 30 12:09:13 1992
Date: Fri, 30 Oct 92 20:09:13 -0800
From: edwards@cogsci.Berkeley.EDU (Jane Edwards)
To: corpora@uib.no
Subject: Re: tagger for agglutinative languages

Jorge Hankamer's parser for Turkish may be of interest (hank@ling.ucsc.edu).
Below is a talk abstract; he also has a chapter on parsing agglutinating languages in the Marslen-Wilson (1989) book, Lexical Representation and Process. Hope this helps.

-Jane Edwards

---------------------------------
Wednesday, November 4, 1992, 4 PM
182 Dwinelle Hall

Jorge Hankamer
UC Santa Cruz

"Morphology and Morphological Parsing"

Abstract:

This talk will be a report on a project which I have been engaged in for about ten years. The project involves the development of a computational morphological analyzer as a component of a text-based study of morphology and syntax in modern standard Turkish.

What I have in mind is to talk about my parsing project in fairly broad terms, focusing on the interactions between the implementation of an accurate parser and the development of an accurate grammatical description of the morphology and phonology of a language.

From corpora-request@uib.no Thu Nov 5 18:14:31 1992
Date: Thu, 5 Nov 1992 17:14:31 +0100
From: "LINGLUND%SELDC52.bitnet"
To: corpora@x400.hd.uib.no
Subject: Modern Assyrian

Does anyone know if there are any machine-readable modern Assyrian texts that are available for research purposes? I would also be grateful to know if anyone is doing research on modern Assyrian based on written texts.

Schlemon Moussa
Department of Linguistics
Lund University
Helgonabacken 12
S-223 62 LUND
Sweden
E-mail: lings@lings.lu.se