YAPC | talks
45 minute talk
Moreover, there are hundreds of exceptions to the application of these contractions. Some exceptions are systematic, for example: the "ea" in "-eable" words ("impermeable", "peaceable", "knowledgeable", et al.) never contracts; other exceptions are specific to just one word: idiosyncratically, the "ea" in "lineage" doesn't contract; the correct spelling is "l_in_e_a_g_e", not "l_in_ea_ge". These exceptions are formalized in a data file bundled with Braille typesetting software freely distributed by the National Federation of the Blind.
The task of Braille encoding (i.e., going from uncontracted text to normal, encoded text) is basically that of scanning each word, looking for sequences of characters that can be contracted, and then replacing those character-substrings with their contractions. The word is scanned in only one pass, going left to right, just as would be implementable with a common string scanning-and-replacement formalism. Moreover, I demonstrate that the task of matching all-and-only the character strings that should be replaced is feasible with regular expressions, specifically.
Consider compression based on a dictionary of rules that consists simply of
"the" -> "W"
"ea" -> "X"
"th" -> "Y"
"er" -> "Z"
where these are all unrestricted as to where in the word they can apply. A regexp substitution that matches all these target strings would be:
$word =~ s/(the|ea|th|er)/&lookup($1)/eg;
If $word is "leather", for example, this correctly yields "lXWr" ("l_ea_the_r"), instead of the incorrect "lXYZ" ("l_ea_th_er").
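The "leather" result depends on the order of the alternatives: Perl tries them left to right at each position, so listing "the" before "th" lets the longer contraction win. The following is a minimal sketch of that behaviour, assuming the four-rule dictionary implied by the example (the -> W, ea -> X, th -> Y, er -> Z). The lookup() body and the encode_wrong_order() helper are illustrative stand-ins, not code from the talk.

```perl
use strict;
use warnings;

# Assumed toy dictionary, inferred from the "leather" example.
my %contraction = (the => 'W', ea => 'X', th => 'Y', er => 'Z');

# Stand-in for the lookup() routine called in the substitution.
sub lookup { return $contraction{ $_[0] } }

sub encode {
    my ($word) = @_;
    # "the" precedes "th", so the longer match is tried first.
    $word =~ s/(the|ea|th|er)/lookup($1)/eg;
    return $word;
}

# Hypothetical variant with "th" listed before "the", for contrast.
sub encode_wrong_order {
    my ($word) = @_;
    $word =~ s/(th|ea|the|er)/lookup($1)/eg;
    return $word;
}

print encode('leather'), "\n";              # lXWr
print encode_wrong_order('leather'), "\n";  # lXYZ
```

With "th" first, the scanner consumes "th" at the fourth letter and then finds "er", which gives the incorrect "l_ea_th_er" reading.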
Further elaborations include implementing rule contexts with \b and \B, and automatic generation of a (very large) regexp from the NFB data table of general rules and exceptions.
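As a rough sketch of how such generation might look, the code below builds one regexp from a small rule table, attaching \B context assertions and a negative lookahead for the "-eable" exception. The table layout, the assumption that "ea" contracts only word-internally, and all variable names are hypothetical; the format of the real NFB data file is not shown here.

```perl
use strict;
use warnings;

# Hypothetical rule table: [ letters, contraction, before, after ].
# "before" and "after" are zero-width regexp fragments giving the context.
my @rules = (
    [ 'the', 'W', '',   ''            ],
    [ 'ea',  'X', '\B', '\B(?!ble\b)' ],  # assumed: word-internal, not in "-eable"
    [ 'th',  'Y', '',   ''            ],
    [ 'er',  'Z', '',   ''            ],
);

my (%code_for, @alternatives);
# Longest letter sequences first, so "the" is tried before "th".
for my $rule (sort { length $b->[0] <=> length $a->[0] } @rules) {
    my ($letters, $code, $before, $after) = @$rule;
    $code_for{$letters} = $code;
    push @alternatives, $before . quotemeta($letters) . $after;
}
my $pattern      = join '|', @alternatives;
my $contractions = qr/($pattern)/;

sub encode {
    my ($word) = @_;
    # The context assertions are zero-width, so $1 holds only the letters.
    $word =~ s/$contractions/$code_for{$1}/g;
    return $word;
}

print encode($_), "\n" for qw(leather peaceable eat);
# prints lXWr, pXceable, eat (one per line)
```

Keying the replacement on the matched letters alone is a simplification: it would not cope with a table in which the same letters map to different contractions in different contexts.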
45 minute talk
So far, efforts have focussed on making the syntax of programming languages (PLs) resemble the syntax of natural languages (NLs), or at least some subset thereof. However, in this talk I point out that linguistic models of NLs have several levels of complexity, and that we should consider, at /each/ of these levels, what similarities already exist between NLs and PLs, whether the basic goals and methods of NLs and PLs are comparable, and the (im)practicality of PLs becoming more like NLs.
I focus on the levels of syntax, semantics, and pragmatics, but I include points from sociolinguistics, historical linguistics, and language typology.
At the level of syntax, I argue that PLs can be made to superficially resemble NLs, but that beyond a certain point one faces several intractable problems, including the unavoidable syntactic ambiguity of NLs. For example, in the NL/PL phrase "if X is greater than Y and Z, then print it", it is inherently unclear to both a human listener and a PL parser just what "it" refers to, and whether the sentence means "if((X > Y) && (X > Z))..." or "if(X > (Y && Z))...".
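The two readings are not merely notational variants; they can give different answers for the same values. A small Perl sketch, with arbitrarily chosen sample values and illustrative variable names:

```perl
use strict;
use warnings;

my ($x, $y, $z) = (5, 9, 2);   # arbitrary sample values

# Reading 1: X is greater than Y, and X is greater than Z.
my $reading_1 = ($x > $y) && ($x > $z);

# Reading 2: X is greater than the value of (Y and Z).
# In Perl, (9 && 2) evaluates to 2, so this compares 5 > 2.
my $reading_2 = $x > ($y && $z);

print "reading 1: ", ($reading_1 ? "true" : "false"), "\n";   # false
print "reading 2: ", ($reading_2 ? "true" : "false"), "\n";   # true
```

A parser has to commit to one of these, whereas a human listener would normally resolve the phrase from context.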
At the level of pragmatics, my observations include: living NLs serve a wide range of functions -- the same NL you make a shopping list in, you can chat with a friend in, or give a conference talk in. For each of these functions, the language adapts, with peculiar lexicons and different standards for clarity, organization, and intelligibility to others; so I assert that PLs should be similarly adaptable to special tasks.
Sean is a columnist for The Perl Journal, amongst his many laudable thingies.