* February 25, 2003

Speed up ideas
 - Check to see if equation block is empty before preparing to run and
 running it 
 - Do all of syntactic transfer/generation first, then try all underlying
 lexical combinations, including words inserted and words chosen by FS
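
A minimal sketch of the first idea (skip the prepare step when the
equation block is empty). EquationBlock, run_block and the flat-dict
feature structures are hypothetical stand-ins, not the engine's real
classes:

```python
from dataclasses import dataclass, field


@dataclass
class EquationBlock:
    """Hypothetical stand-in for a rule's block of unification equations."""
    equations: list = field(default_factory=list)  # (feature, value) pairs

    def is_empty(self):
        return not self.equations


def run_block(block, fs, stats):
    """Apply every equation to a copy of fs; return None if unification fails."""
    if block.is_empty():
        stats["skipped"] += 1
        return fs                    # nothing to unify, so no copy or setup
    stats["prepared"] += 1
    result = dict(fs)                # the "prepare" step: work on a copy
    for feature, value in block.equations:
        if result.get(feature, value) != value:
            return None              # conflicting value: the block fails
        result[feature] = value
    return result
```

The saving comes from the early return: an empty block hands back the
input FS untouched instead of copying it.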

Include
 - attempt filters (succeed once and remove)
 - fail always
 - rule watches/step through
 - adding in word based on unification constraints
 - translate just a single word with no syntactic part
   - now needs to consider possible multiple translations
 - (add) ability to parse with string (e.g. "word") in source rhs
 - move nice command line into TransferEngine and combine initfile and
 commandline argument processors
 - have clear (clear just temporary variables for a run) and reset
 (clear rules, etc.) 


Group all rules (syntactic and lexical) into parse groups.
Each parsegroup contains
  - source production rule lhs and rhs
  - parse equation block (? or just use one of the rules' parse eb to
  avoid duplication)
  - rule type (PHRASE, LEX, COMPOUND)
  - vector of rules in block
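
One possible layout for the parsegroup described above, as a sketch
only. Rule, ParseGroup and group_rules are illustrative names, and
comparing parse equation blocks by exact equality is a simplifying
assumption:

```python
from dataclasses import dataclass, field
from enum import Enum


class RuleType(Enum):
    PHRASE = "PHRASE"
    LEX = "LEX"
    COMPOUND = "COMPOUND"


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule record; only the source (parse) side is modelled."""
    name: str
    lhs: str
    rhs: tuple
    rule_type: RuleType
    parse_eb: tuple = ()


@dataclass
class ParseGroup:
    lhs: str
    rhs: tuple
    parse_eb: tuple          # stored once for the whole group
    rule_type: RuleType
    rules: list = field(default_factory=list)


def group_rules(rules):
    """Group rules that share source production, parse eb and rule type."""
    groups = {}
    for rule in rules:
        key = (rule.lhs, rule.rhs, rule.parse_eb, rule.rule_type)
        if key not in groups:
            groups[key] = ParseGroup(rule.lhs, rule.rhs, rule.parse_eb, rule.rule_type)
        groups[key].rules.append(rule)
    return list(groups.values())
```

With this layout the parser would only need to match one production
per group, and the transfer step would later iterate over group.rules.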

  - What to do with unknown words?
    - have a special parsegroup for those or create it on the fly?
    - allow user to specify a set of "open class" POS (e.g. N, ADJ, V)
      an unknown word could be assigned one of these for parsing
    - pre-process text for unknown words, numbers, etc. before parsing
    (?)

  - Have separate method to process words and add in AGENDA
    - known words
    - unknown words that can be processed (dates, numbers)
    - unknown words findable after morphological processing
    - unknown words, add as open class POS
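
The four cases above could be tried in order, roughly as sketched
below. The lexicon format, the "NUM" tag, the number pattern and the
toy suffix stripper are all assumptions made for the example:

```python
import re

OPEN_CLASSES = ("N", "ADJ", "V")   # example set; meant to be user-specified
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")


def strip_suffix(word):
    """Toy morphology (hypothetical): strip a final 's', nothing more."""
    return word[:-1] if word.endswith("s") and len(word) > 1 else None


def agenda_entries(word, lexicon):
    """Return (surface, POS, source) entries to add to the agenda."""
    if word in lexicon:                              # known word
        return [(word, pos, "known") for pos in lexicon[word]]
    if NUMBER_RE.fullmatch(word):                    # processable unknown
        return [(word, "NUM", "processed")]
    stem = strip_suffix(word)                        # morphological lookup
    if stem in lexicon:
        return [(stem, pos, "morphology") for pos in lexicon[stem]]
    # Last resort: one entry per open-class POS
    return [(word, pos, "open-class") for pos in OPEN_CLASSES]
```

Keeping this in one method means the parser itself never has to know
why a word ended up on the agenda.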
  
Parsing proceeds much as before, except using parsegroups
and not individual rules. (?)


Back-off
  - Keep track of branch points
  - keep back up FS copies that can be restored for next try
  - Still use depth first search and a stack to guide search
    but instead of popping from stack when done with a constituent,
    keep it around, keep track of current active transfer constituent
    and only pop when back-tracking  
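
A rough sketch of the back-off scheme: stack entries stay on the stack
after a constituent succeeds and are only popped when back-tracking.
Flat dicts stand in for FSs, and search/unify are hypothetical names.
Because unify returns a new dict, the old FS kept on the stack serves
as the backup copy:

```python
def unify(fs, update):
    """Toy unification on flat dicts; returns a new FS, or None on conflict."""
    for key, value in update.items():
        if fs.get(key, value) != value:
            return None
    return {**fs, **update}


def search(constituents):
    """constituents: one list of alternative feature dicts per constituent.

    Returns (chosen alternative per constituent, final FS), or None.
    """
    stack = []                       # (constituent index, chosen alt, FS backup)
    fs, i, alt = {}, 0, 0
    while i < len(constituents):
        alts = constituents[i]
        result = None
        while alt < len(alts):
            result = unify(fs, alts[alt])
            if result is not None:
                break
            alt += 1
        if result is not None:
            stack.append((i, alt, fs))       # branch point stays on the stack
            fs, i, alt = result, i + 1, 0
        elif stack:
            i, prev_alt, fs = stack.pop()    # back-track: restore the backup FS
            alt = prev_alt + 1               # resume with the next alternative
        else:
            return None                      # nothing left to try
    return [chosen for _, chosen, _ in stack], fs
```

For example, with alternatives [sg, pl] for the first constituent and a
later constituent that requires pl, the search first commits to sg,
fails at the end, and backs up to pick pl.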

Lexical Transfer/Generation
  - run all possibilities first (before combinatory attempts),
  removing those that fail unification; don't use these in later
  combinatory attempts
  - Keep the resulting FSs from the above possibility checks, now just
  need to run (modified) FillTargetFS with different lexical
  combinations, copying over needed FSs
  - for words found by FS unification, find all matches first before
  running combinations, keep a list of matches and just use this
  during combinations
  - for inserted target words (ones included as a string in
  production) there is only one choice and one possible FS
  - only create wordfss after successful fill/constraint
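
The filter-first idea above, sketched under assumptions: candidates are
(word, FS) pairs, each slot has its own constraint FS, and the
combination check is reduced to agreement on a single feature. None of
these names come from the engine:

```python
from itertools import product


def unify(fs, update):
    """Toy unification on flat dicts; returns a new FS, or None on conflict."""
    for key, value in update.items():
        if fs.get(key, value) != value:
            return None
    return {**fs, **update}


def filter_slot(candidates, slot_constraint):
    """Run each candidate once; drop the ones that fail unification."""
    kept = []
    for word, fs in candidates:
        merged = unify(fs, slot_constraint)
        if merged is not None:
            kept.append((word, merged))      # keep the resulting FS for reuse
    return kept


def combine(slots, constraints, agree_on):
    """Try combinations of the surviving candidates only."""
    survivors = [filter_slot(c, k) for c, k in zip(slots, constraints)]
    results = []
    for combo in product(*survivors):
        values = {fs[agree_on] for _, fs in combo if agree_on in fs}
        if len(values) <= 1:                 # all present values agree
            results.append(" ".join(word for word, _ in combo))
    return results
```

A candidate that fails on its own can never succeed inside a
combination, so removing it up front shrinks the product without
losing any translation.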

Fill and Yn-Yn Constraint pass
  - Cache lower tree fs results?
    - probably, but only on one transfer (excluding lex) tree, then clear
    - also need to have same direct ancestors to be considered equal
    cache-wise? probably
  - recursive method should also pass ancestors
  - make tconstituents, tarcs class variables?
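
A sketch of the caching question above, assuming the cache key is the
node plus its full ancestor path and that the cache is cleared between
transfer trees. FillCache and fill_tree are hypothetical, and the
"fill" result is just a nested tuple here:

```python
class FillCache:
    """Per-tree cache of fill results, keyed on node and ancestors."""

    def __init__(self):
        self._results = {}
        self.hits = 0

    def clear(self):
        """Call when moving on to the next transfer tree."""
        self._results.clear()

    def fill(self, node, ancestors, compute):
        key = (node, ancestors)      # same node, different ancestors: new entry
        if key in self._results:
            self.hits += 1
            return self._results[key]
        result = compute(node, ancestors)
        self._results[key] = result
        return result


def fill_tree(tree, node, cache, ancestors=()):
    """Recursive fill that passes the ancestor path down to each child.

    tree maps a node id to the list of its child ids.
    """
    def compute(n, anc):
        children = [fill_tree(tree, c, cache, anc + (n,)) for c in tree.get(n, [])]
        return (n, tuple(children))

    return cache.fill(node, ancestors, compute)
```

Passing the ancestors down the recursion is what lets two nodes with
the same label but different parents get separate cache entries.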

Adding new rule 
  - need to add to srclexicon and rulefinder after processing whole
  rule so we know what parse group it belongs to


* May 2, 2003
  - Fixed bugs in load from init file
  - Fixed bugs in parsing and reading in transfer grammar
  - able to read in Kathrin's grammar/lexicon and parse a sentence
  - TODO: need more intelligent comparison of parse equation blocks for parsegroups
  - TODO: back-tracking! Done!


Chinese -> English Test Set (10 sentences)
 old : 1.62 seconds
 new : 0.54 seconds

Tests for accuracy with 
 1. compounds
 2. transfer
 3. words inserted based on feature struct
 4. words inserted as a string
 5. features passed up target tree
 6. Y-side agreement constraints
 7. compositionality
 8. Have source literals in parse rules
All passed!  Actually better than old xfer since
some logic bugs were fixed.

English to German set (9 sentences)
 old : 321.92 seconds
 new :   9.59 seconds

Yet to test:
 Helper functions
 1. Rule fails
 2. Rule watches
 3. Attempt filters
 4. Partial translations
 5. Deleting rules

Still to do:
 Parsing (left hooks for all these to add in later)
 1. Put morphology back in (thought we'd be switching)
 2. Deal with unknown words (add as one of open classes)
 Wish-list
 1. Add server functionality to let it work with graphical interface


June 26:
 - Put in ordering for constituents before lex entries in partial translations
 - Took out dupe checking

June 27:
 - Took out constituent before lex order for partial translations

June 28:
 - Added simple Unicode lowercase (only handles ASCII range)
 - Priority example  (V unifies with N first)
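
For reference, an ASCII-only lowercase over a Unicode string behaves
roughly like the sketch below (ascii_lower is an illustrative name, not
the engine's function). Characters outside A-Z pass through unchanged:

```python
def ascii_lower(text):
    """Lowercase only the ASCII letters A-Z; leave everything else as-is."""
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch
        for ch in text
    )
```

So accented capitals such as "Ü" stay uppercase, which matters when
looking up words in a lowercased lexicon.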


July 23, 2003:

How to get better translations out sooner?
How to improve best partial output?

Aug. 26-27, 2003:
 Added trace output, so final target tree can be returned
 Fixed a bug where lexrule was getting set wrong, added separate tlexrule to fix problem