LARG is dedicated to reading and discussing LA-related literature. Reading as a group is a good way to distribute the cognitive workload of digesting a large amount of material.

A secondary goal for the group is to serve as a forum for exchanging new LA-related research ideas and discussions on any ongoing research.

Definition of Language Acquisition

According to The MIT Encyclopedia of the Cognitive Sciences, 1999:

"Language acquisition refers to the process of attaining a specific variant of human language. ... The fundamental puzzle in understanding this process has to do with the open-ended nature of what is learned."

Loosely based on this idea of "what is learned", the topics of LA are categorized below. Some sample questions are listed for each topic, but they are not meant to be exhaustive. The overall emphasis is on computational modelling of natural language learning and understanding.

T1. Lexical acquisition: How are the lexical semantics of a language learned by a purely symbolic, connectionist, or stochastic process (or something else)? Should we ground the semantics in the environment (multi-modal acquisition incorporating visual and auditory inputs)? How do metaphors work, and how can they be learned?

T2. Grammar induction: Given pieces of information about a human language (obtained either by observation or by asking questions of an informant), how can the grammar of the language be induced? How difficult is the problem in theory? How can an ontology (or world knowledge) be used to help the induction process? How can the morphological analysis (intra-word grammar) of a language be automated?

T3. Discourse/pragmatics learning: How can additional meanings be acquired from discourse/pragmatic cues? How do we reliably derive the set of primitives?

T4. Knowledge representation/inferences for LA: How do we represent the things learned? How to cope with uncertainties and noise (non-monotonic learning)? What kind of inferences can we make based on a particular representation and what are the appropriate inference procedures?

LA research is intrinsically multi-disciplinary, and in LARG the following three perspectives are emphasized:

P1. Computational perspective: How do we realize LA on machines? What are the complexities/costs associated with the approaches?

P2. Cognitive science perspective: Can we separate the general intelligence from the intelligence required by LA? How does a child pick up her mother tongue? What are the relevant facts gathered from the experiments? What are the contemporary learning theories for LA?

P3. Linguistic perspective: What are the general linguistic principles, if any, that make language learning possible? Which linguistic phenomena are common across languages, and which are specific to particular languages?

To better position LARG, the following list shows what LARG is *not*. In some cases the boundaries can be elusive.

N1. LARG does not cover speech-specific topics, such as post-editing. Speech Lunch is a better group to join if you are interested in these issues.

N2. LARG does not cover parsing-specific topics. Parsing Lunch is a better group to join if you are interested in these issues.

N3. LARG does not cover purely linguistic discussions, such as those covered in the Lexical Functional Grammar reading group.

N4. LARG is application-agnostic. The focus is on the more fundamental issues (but obviously the techniques can be applied to IR, MT, HCI, etc.).


  • Speaker: Participants take turns presenting papers, book chapters, new research proposals, or reports on any ongoing research. Speakers are not assumed to be experts in the field - it's the mutual learning process we value most in LARG. We may decide to invite outside speakers.

  • Presentation: As long as the topic is LA-related (as defined above), feel free to present it in any way you like. Presentations of long material can be broken down into installments, which can be interleaved with other presentations.

  • Material: An extensive bibliography will be compiled through the collaborative effort of all participants, covering both classic and new works in LA.

  • Discussion: As informal as possible(!). People are allowed to ask any 'immature' questions and the speaker should expect that.

  • Frequency: Bi-weekly.

  • Time: 1530-1700, on Fridays (specific dates will be announced).

  • Place: NSH 4513.

  • Website:

  • Mailing list: (to subscribe/unsubscribe, please send your request to Kenji Sagae)

version 1.0 ; htmlinst date: Fri Nov 7 16:49:04 2003

Webmaster: Benjamin Han (benhdj at cs dot cmu dot edu)