This is a list of useful papers.
  1. "Turning to the masters: motion capturing cartoons"
    http://movement.stanford.edu/tooncap/

    @inproceedings{bregler_2002,
     author = {Christoph Bregler and Lorie Loeb and Erika Chuang and Hrishi Deshpande},
     title = {Turning to the masters: motion capturing cartoons},
     booktitle = {SIGGRAPH '02: Proceedings of the 29th annual conference on Computer graphics and interactive techniques},
     year = {2002},
     isbn = {1-58113-521-1},
     pages = {399--407},
     location = {San Antonio, Texas},
     doi = {http://doi.acm.org/10.1145/566570.566595},
     publisher = {ACM Press},
     address = {New York, NY, USA},
     }
    

    Summary notes: Animation = visual style + motion style. The motion style can be captured by learning the transformation parameters from a set of key-frames to every other frame that makes up the animation. Having chosen K keyframes, every intermediate frame is a warped version of a linear combination of the K keyframes. Given K new keyframes and the parameters of the original motion style (warp + combination weights) for each intermediate frame, new intermediate frames can be created. Input needed: cartoon contours for each of the keyshapes + cartoon contours for each of the intermediate frames of the original sequence.
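    The core idea (each intermediate frame is a warp applied to a linear combination of the K keyshapes) can be sketched in a few lines. This is a toy illustration, not the paper's exact parameterization: the affine warp model, `blend_keyshapes`, and `warp_affine` are my own assumptions.

```python
import numpy as np

def blend_keyshapes(keyshapes, weights):
    """Linear combination of K keyshape contours (each an (N, 2) array)."""
    keyshapes = np.asarray(keyshapes, dtype=float)   # (K, N, 2)
    weights = np.asarray(weights, dtype=float)       # (K,)
    return np.tensordot(weights, keyshapes, axes=1)  # (N, 2)

def warp_affine(contour, A, t):
    """Apply an affine warp (2x2 matrix A plus translation t) to a contour."""
    return contour @ A.T + t

# Two keyshapes of a 3-point contour; an intermediate frame is a warped blend.
k0 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
k1 = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
blend = blend_keyshapes([k0, k1], [0.5, 0.5])        # halfway shape
frame = warp_affine(blend, np.eye(2) * 2.0, np.array([1.0, 1.0]))
```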

  2. "Animal gaits from video"
    Project webpage

    @inproceedings{favreau_2004,
     author = {Laurent Favreau and Lionel Reveret and Christine Depraz and Marie-Paule Cani},
     title = {Animal gaits from video},
     booktitle = {SCA '04: Proceedings of the 2004 ACM SIGGRAPH/Eurographics symposium on Computer animation},
     year = {2004},
     isbn = {3-905673-14-2},
     pages = {277--286},
     location = {Grenoble, France},
     doi = {http://doi.acm.org/10.1145/1028523.1028560},
     publisher = {Eurographics Association},
     address = {Aire-la-Ville, Switzerland, Switzerland},
    }
    

    Summary notes: Cites Bregler(2002). Aim to animate 3D models of animals from live video sequences. Extract 'the cheetah type of run' from long video sequences through PCA and then animate a 3D mesh model of a cheetah using those components. Note: we do not deal with cyclical motion, whereas the method in this paper works only for cyclical movement.
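    A minimal PCA sketch of the "extract the motion's principal components" step, assuming the frames have already been flattened into feature vectors (the paper's actual input processing and model fitting are more involved than this):

```python
import numpy as np

def gait_components(frames, n_components=2):
    """PCA over flattened frames: returns the mean and the principal modes.

    frames: (T, D) array, one flattened silhouette/feature vector per frame.
    """
    frames = np.asarray(frames, dtype=float)
    mean = frames.mean(axis=0)
    # SVD of the centered data gives the principal components as rows of vt.
    _, s, vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, vt[:n_components], s[:n_components]

# Toy cyclic "gait": each frame is a point on a circle embedded in 5-D.
t = np.linspace(0.0, 2 * np.pi, 100, endpoint=False)
frames = np.zeros((100, 5))
frames[:, 0] = np.cos(t)
frames[:, 1] = np.sin(t)
mean, modes, sv = gait_components(frames)
# Two modes capture essentially all the variance of this cyclic motion.
```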

  3. "An example-based approach for facial expression cloning"
    www.cs.umsl.edu/~kang/Papers/kang_sca03.pdf

    @inproceedings{pyun_06,
     author = {Hyewon Pyun and Yejin Kim and Wonseok Chae and Hyung Woo Kang and Sung Yong Shin},
     title = {An example-based approach for facial expression cloning},
     booktitle = {SIGGRAPH '06: ACM SIGGRAPH 2006 Courses},
     year = {2006},
     isbn = {1-59593-364-6},
     pages = {23},
     location = {Boston, Massachusetts},
     doi = {http://doi.acm.org/10.1145/1185657.1185863},
     publisher = {ACM Press},
     address = {New York, NY, USA},
    }
    

    Summary notes: Cites Bregler(2002).
     Input= 3D facial animation for a source model
     Output= similar animation for target model
    Therefore, well-described key-frames are required for this approach. See references 12, 24, 6 for "transferring parameters from the source space (2D videos) to the target space (3D videos)".

  4. "A sketching interface for articulated figure animation"
    http://graphics.stanford.edu/papers/sketch_interface/

    @inproceedings{davis_2003,
     author = {James Davis and Maneesh Agrawala and Erika Chuang and Zoran Popovi\'{c} and David Salesin},
     title = {A sketching interface for articulated figure animation},
     booktitle = {SCA '03: Proceedings of the 2003 ACM SIGGRAPH/Eurographics symposium on Computer animation},
     year = {2003},
     isbn = {1-58113-659-5},
     pages = {320--328},
     location = {San Diego, California},
     publisher = {Eurographics Association},
     address = {Aire-la-Ville, Switzerland, Switzerland},
    }
    

    Summary notes: Create 3D articulated animation from 2D sketches of the character. Construct all possible 3D configurations, apply some constraints and some heuristics to return the most likely 3D pose. Focus on reconstructing keyframes, not interpolating between them (this complements the Bregler 2002 approach, which focuses on the interpolation).
     Input = artist's sketch overlaid with stick figures.
     use the skeleton to extract 2D locations of joints and bones- use locations to reconstruct 3D pose.
     The system has a set of default assumptions and an interface allowing user guidance.
     see Hecker and Perlin - sketch based animation system with touch sensitive tablet.
     Pose reconstruction - approach 1: the 3D skeleton is known a priori; the user hand-specifies the 3D pose for the very first 2D frame and reinitializes if the system wanders later on. The system uses optimization techniques to track the 2D image features and find a set of 3D skeletal joint angles to match.
     Approach 2: automatically learn a mapping between 2D image features and 3D poses. This is done using a large training dataset where the 2D-to-3D correspondence is already known.
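    Approach 2 can be caricatured as a nearest-neighbour lookup in a training set of (2D features, 3D pose) pairs. The data below is made up, and real systems use far richer regressors than this:

```python
import numpy as np

def nearest_pose(features_2d, train_feats, train_poses):
    """Approach 2 in miniature: return the 3D pose whose training-set
    2D features are closest (Euclidean) to the observed 2D features."""
    d = np.linalg.norm(train_feats - features_2d, axis=1)
    return train_poses[np.argmin(d)]

# Hypothetical training set: 2D joint features paired with 3D joint angles.
train_feats = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
train_poses = np.array([[10.0], [20.0], [30.0]])  # one "joint angle" each
pose = nearest_pose(np.array([0.9, 0.1]), train_feats, train_poses)
```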

  5. "Stylizing motion with drawings"
    http://www.cs.wisc.edu/graphics/list/list2.py?GraphicsWeb/video.html

    @inproceedings{li_2003,
     author = {Yin Li and Michael Gleicher and Ying-Qing Xu and Heung-Yeung Shum},
     title = {Stylizing motion with drawings},
     booktitle = {SCA '03: Proceedings of the 2003 ACM SIGGRAPH/Eurographics symposium on Computer animation},
     year = {2003},
     isbn = {1-58113-659-5},
     pages = {309--319},
     location = {San Diego, California},
     publisher = {Eurographics Association},
     address = {Aire-la-Ville, Switzerland, Switzerland},
    }
    

    Summary notes: Given an input 3D animation sequence (sufficiently realistic, e.g. motion captured) called 'renderings', and some artist-drawn expressive keyframes called 'examples', how can the renderings be deformed so as to incorporate the example frames at the correct positions in the resulting animation?
     Input = renderings, example frames, point correspondences for example frames and associated renderings, skeleton
     Output = new sequence which contains the example image and a seamless transition between preceding and subsequent renderings
     The 3D pose that leads to the example keyframes is determined by manually adjusting the root/limbs etc. so that the resulting rendering matches the example as best it can; pose optimization is not used because the deformations are large. Point correspondences are extended into curve correspondences using silhouettes and edges. The image is then warped using SSD (skeletal subspace deformation) for the correct skinning.
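    Assuming SSD refers to skeletal subspace deformation (linear blend skinning), the warp can be sketched in 2D; the bones and weights below are made up for illustration:

```python
import numpy as np

def linear_blend_skinning(rest_points, bone_transforms, weights):
    """SSD / linear blend skinning: each vertex is a weighted sum of the
    positions it would take under each bone's rigid transform.

    rest_points: (N, 2); bone_transforms: list of (R, t); weights: (N, B).
    """
    out = np.zeros_like(rest_points, dtype=float)
    for b, (R, t) in enumerate(bone_transforms):
        out += weights[:, b:b + 1] * (rest_points @ R.T + t)
    return out

rest = np.array([[0.0, 0.0], [1.0, 0.0]])
bones = [(np.eye(2), np.array([0.0, 0.0])),   # bone 0: identity
         (np.eye(2), np.array([0.0, 1.0]))]   # bone 1: shift up by 1
w = np.array([[1.0, 0.0],    # vertex 0 follows bone 0 only
              [0.5, 0.5]])   # vertex 1 blends both bones equally
skinned = linear_blend_skinning(rest, bones, w)
```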

  6. "Motion doodles: an interface for sketching character motion"
    http://www.cs.ubc.ca/~van/papers/doodle.html

    @inproceedings{thorne_2004,
     author = {Matthew Thorne and David Burke and Michiel van de Panne},
     title = {Motion doodles: an interface for sketching character motion},
     booktitle = {SIGGRAPH '04: ACM SIGGRAPH 2004 Papers},
     year = {2004},
     pages = {424--431},
     location = {Los Angeles, California},
     doi = {http://doi.acm.org/10.1145/1186562.1015740},
     publisher = {ACM Press},
     address = {New York, NY, USA},
    }
    

    Summary notes: Given a sketch of a simple articulated character and a 'motion', this system animates the character accordingly. The articulated character is constrained in some ways: relative limb lengths, roughly humanoid looking, drawn in 7 pen strokes, etc. The 'motion' is specified using a motion vocabulary - such-and-such a curve means walking or doing a backflip etc. - the vocabulary is well-defined and restricted but easy enough to pick up. A major task here is to convert pen strokes to motion primitives and extract parameters like the intended height of a jump, the landing point in real 3D coordinates, etc. After that comes IK etc.
    Note to self: see articulated figure recognition: Forsyth and Fleck 1997, Teichmann and Teller 1998, Davis et al 2003, gesture recognition: Rubine 1991, Hammond and Davis 2003
    Google computer puppetry?

  7. "Automatic Detection of Human Nudes"
    http://www.eecs.berkeley.edu/Research/Projects/CS/vision/human/index.html

    @article{forsyth_1999,
    	title = {Automatic Detection of Human Nudes},
    	author = {D.A. Forsyth and M.M. Fleck},
    	journal = {International Journal of Computer Vision},
    	year = {1999},
    	month = {November},
    	pages = {63--77},
    	volume = {32},
    	number = {1},
    	doi = {10.1023/A:1008145029462},
    }
    

    Summary notes: Automatic system for detecting if human nudes are present in an image. How it is done:

    1. Look for large areas of skin-coloured pixels - I am not interested in this.
    2. Find regions that are similar to the projection of cylinders - may be useful
    3. Group into possible human limbs and connected groups of limbs- may be useful

    Note to self: see Kriegman and Ponce 1990 PAMI - curved 3D objects from image contours

  8. "Synthesis of complex dynamic character motion from simple animations"
    http://grail.cs.washington.edu/projects/charanim/

    @inproceedings{liu_2002,
     author = {C. Karen Liu and Zoran Popovi\'{c}},
     title = {Synthesis of complex dynamic character motion from simple animations},
     booktitle = {SIGGRAPH '02: Proceedings of the 29th annual conference on Computer graphics and interactive techniques},
     year = {2002},
     isbn = {1-58113-521-1},
     pages = {408--416},
     location = {San Antonio, Texas},
     doi = {http://doi.acm.org/10.1145/566570.566596},
     publisher = {ACM Press},
     address = {New York, NY, USA}
    }
    

    Summary notes: rapid prototyping of realistic character motion. Given input character motion (by an animator), detect constraints and generate realistic motion that satisfies the constraints and conserves momentum. Note the emphasis on realism - note how the Disney books say that the animated characters must look like they have weight etc, but must exaggerate/break rules of physics to convey the impression of 'life'.

  9. "Evaluating video-based motion capture"
    Michael Gleicher's papers

    @misc{ gleicher02evaluating,
      author = "M. Gleicher and N. Ferrier",
      title = "Evaluating video-based motion capture",
      text = "M. Gleicher and N. Ferrier. Evaluating video-based motion capture. In Proceedings
        of Computer Animation 2002.",
      year = "2002",
      url = "citeseer.ist.psu.edu/gleicher02evaluating.html" 
    }
    

    Summary notes: reconstructing 3D motion given image observations is an ongoing research problem!
    The motion capture problem: Given a single stream of video observations of a performer, compute a 3D skeletal representation of the motion of sufficient quality to be useful for animation.

    Challenges posed by animation as an end: Noise is high frequency, but a LPF is not the solution because crispness of motion, sudden actions etc. are high frequency too. Data-driven approaches to removing noise are not useful because what you want is 'this particular performance' and not the generic qualities of that class of performances. Using video as mocap: 2D-to-3D is ambiguous. Since the idea is to mocap novel motions, previous data cannot be used as cues.

    Constraint based approach: find restrictions on the pose; each new observation narrows the space of possible poses. Section 3.1: they say that they wish to avoid the most likely pose because they wish to capture novel poses. My comment: if your elbow is here and your wrist is there, there is only one place your arm can be; choosing the most likely sub-pose based on criteria such as connectedness or temporal nearness may be the way to use MLE, instead of choosing the most likely whole pose based on previously seen data.
    Sample problem they tackled: Given an object with a known rigid kinematic tree, a known initial pose, and image observations of joints in subsequent frames, find the best pose (i.e. the 3D locations of the joint points) that satisfies these constraints. The skeletal model is not used for tracking - avoid using strong models. The skeletal model IS used for reconstruction. Pose reconstruction is shown for a karate kick - the constraint solver chooses a plausible pose that is, alas, incorrect.
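    The "if your elbow is here and your wrist is there" intuition has a classic closed form: under (scaled) orthographic projection, a bone of known length constrains the out-of-plane offset up to sign. A sketch of that one constraint, not Gleicher and Ferrier's actual solver:

```python
import math

def relative_depth(p_parent, p_child, bone_len):
    """Under orthographic projection, a bone of known length L seen as a 2D
    segment of length d can only point 'forward' or 'back' out of the image
    plane by dz = sqrt(L^2 - d^2); the sign remains ambiguous."""
    dx = p_child[0] - p_parent[0]
    dy = p_child[1] - p_parent[1]
    d2 = dx * dx + dy * dy
    if d2 > bone_len ** 2:
        raise ValueError("2D segment longer than the bone: scale mismatch")
    return math.sqrt(bone_len ** 2 - d2)

# A bone of length 5 projecting to a 2D segment of length 3: dz = 4.
dz = relative_depth((0.0, 0.0), (3.0, 0.0), 5.0)
```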
    Last section has some good references to vision papers that deal with tracking.
    Question: What is 'Timelines, our constraint based animation test bed'?

    Note to self: see shadow motion[1], Bregler and Malik CVPR 1998: Tracking people with twists and exponential maps, Thomas Moeslund: Summaries of 107 computer vision based human motion capture papers (Tech report, Aalborg, 1999)

  10. "Layered acting for character animation"
    http://grail.cs.washington.edu/projects/rapid-anim/

    @article{dontcheva_2003,
     author = {Mira Dontcheva and Gary Yngve and Zoran Popovi\'{c}},
     title = {Layered acting for character animation},
     journal = {ACM Trans. Graph.},
     volume = {22},
     number = {3},
     year = {2003},
     issn = {0730-0301},
     pages = {409--416},
     doi = {http://doi.acm.org/10.1145/882262.882285},
     publisher = {ACM Press},
     address = {New York, NY, USA},
    }
    

    Summary notes: animator 'acts' out the motion, is mocap'd, and the mocap data is converted to an animated character. Coolness: done in real time, with multiple layers so you can create a six-legged creature etc.

  11. "A Survey of Computer Vision-Based Human Motion Capture (2001)"

    @article{ moeslund01survey,
        author = "Thomas B. Moeslund and Erik Granum",
        title = "A Survey of Computer Vision-Based Human Motion Capture",
        journal = "Computer Vision and Image Understanding: CVIU",
        volume = "81",
        number = "3",
        pages = "231--268",
        year = "2001",
        url = "citeseer.ist.psu.edu/moeslund01survey.html" 
    }
    

  12. "Summaries of 107 computer vision-based human motion capture papers"

    @misc{ moeslund99summaries,
      author = "T. Moeslund",
      title = "Summaries of 107 computer vision-based human motion capture papers",
      text = "Moeslund, T. Summaries of 107 computer vision-based human motion capture
        papers. University of Aalborg Technical Report LIA 99-01, March 1999.",
      year = "1999",
      url = "citeseer.ist.psu.edu/moeslund99summaries.html" 
    }
    

  13. "Computer puppetry: An importance-based approach"

    @article{shin_2001,
        author = "Hyun Joon Shin and Jehee Lee and Sung Yong Shin and Michael Gleicher",
        title = "Computer puppetry: An importance-based approach",
        journal = "ACM Transactions on Graphics",
        volume = "20",
        number = "2",
        pages = "67-94",
        year = "2001",
        url = "citeseer.ist.psu.edu/shin01computer.html" 
    }
    

  14. "Computer Puppetry"

    @article{sturman_1998,
     author = {David J. Sturman},
     title = {Computer Puppetry},
     journal = {IEEE Comput. Graph. Appl.},
     volume = {18},
     number = {1},
     year = {1998},
     issn = {0272-1716},
     pages = {38--45},
     doi = {http://dx.doi.org/10.1109/38.637269},
     publisher = {IEEE Computer Society Press},
     address = {Los Alamitos, CA, USA},
     }
    

  15. Notes: computer puppetry
    http://www.thepuppetstudio.com/What.html#computer

  16. "Recovery of 3D articulated models from 2D correspondences"
    http://staffx.webstore.ntu.edu.sg/personal/astjcham/Web/Research/percepter.htm#3d_recon

    @inproceedings{rehg:2001,
      author = "D.E. DiFranco and T.J. Cham and J.M. Rehg",
      title = "Recovery of 3D articulated models from 2D correspondences",
      booktitle = "Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR)",
      year = "2001",
      pages = "307-314",
      volume = "1",
      note = "3D Recovery of Human Motion from Monocular Video
    		# a 3D kinematic model (to specify 3D connectedness of limbs)
    		# joint-angle limits (e.g. to specify limits of elbow and knee rotation)
    		# dynamic motion model (to enforce temporal smoothness of motion)
    		# user-specified 3D key frames (allows users to aid estimation framework, e.g. by providing boundary conditions).
    	  "
    }
    

  17. "Free-viewpoint video of human actors"
    http://portal.acm.org/citation.cfm?id=882309&dl=ACM&coll=portal

    @inproceedings{carranza:2003,
     author = {Joel Carranza and Christian Theobalt and Marcus A. Magnor and Hans-Peter Seidel},
     title = {Free-viewpoint video of human actors},
     booktitle = {SIGGRAPH '03: ACM SIGGRAPH 2003 Papers},
     year = {2003},
     isbn = {1-58113-709-5},
     pages = {569--577},
     location = {San Diego, California},
     doi = {http://doi.acm.org/10.1145/1201775.882309},
     publisher = {ACM Press},
     address = {New York, NY, USA},
     note = {Need to extract pose from silhouette. The energy function for the optimizer is the number of set bits in the XOR of the image silhouette with the rendered-pose silhouette. Optimization is done in four stages - a) root position b) head and hip rotation c) arms d) legs. 
     	Demonstrated on a male ballet dancer.}
     }
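    The XOR energy in the note reduces to counting disagreeing silhouette pixels; a sketch with toy binary masks (the paper evaluates this inside a multi-stage pose optimizer):

```python
import numpy as np

def silhouette_xor_energy(observed, rendered):
    """Energy = number of pixels where the image silhouette and the
    rendered-pose silhouette disagree (sum of the XOR image)."""
    return int(np.count_nonzero(np.logical_xor(observed, rendered)))

obs = np.array([[0, 1, 1],
                [0, 1, 1]], dtype=bool)
ren = np.array([[0, 1, 0],
                [0, 1, 1]], dtype=bool)
energy = silhouette_xor_energy(obs, ren)  # one disagreeing pixel
```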
    

  18. "Interactive control of avatars animated with human motion data"
    http://portal.acm.org/citation.cfm?id=566607&coll=portal&dl=ACM&CFID=27608835&CFTOKEN=65462853

    @inproceedings{lee:2002,
     author = {Jehee Lee and Jinxiang Chai and Paul S. A. Reitsma and Jessica K. Hodgins and Nancy S. Pollard},
     title = {Interactive control of avatars animated with human motion data},
     booktitle = {SIGGRAPH '02: Proceedings of the 29th annual conference on Computer graphics and interactive techniques},
     year = {2002},
     isbn = {1-58113-521-1},
     pages = {491--500},
     location = {San Antonio, Texas},
     doi = {http://doi.acm.org/10.1145/566570.566607},
     publisher = {ACM Press},
     address = {New York, NY, USA},
     note = {Take silhouette of image, compute first 7 Hu moments and use these to discriminate shape. Along with Hu moments, use the coordinates of the center of the silhouette.}
     }
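    The first Hu invariant (eta20 + eta02) and the silhouette centroid can be computed directly from image moments. A numpy sketch; the paper uses the first seven invariants, which follow the same normalization pattern:

```python
import numpy as np

def first_hu_moment_and_centroid(mask):
    """First Hu invariant (eta20 + eta02) plus the silhouette centroid,
    computed from the raw moments of a binary mask."""
    ys, xs = np.nonzero(mask)
    m00 = float(len(xs))                 # zeroth moment = area
    cx, cy = xs.mean(), ys.mean()        # centroid
    mu20 = np.sum((xs - cx) ** 2)        # central moments
    mu02 = np.sum((ys - cy) ** 2)
    norm = m00 ** 2                      # mu_pq / m00^(1+(p+q)/2), p+q = 2
    return (mu20 + mu02) / norm, (cx, cy)

mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True                    # a 4x4 square silhouette
phi1, (cx, cy) = first_hu_moment_and_centroid(mask)
```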
    

  19. "Learning silhouette features for control of human motion"
    http://graphics.cs.cmu.edu/projects/swing/

    @article{ ren_2004,
     author = {Liu Ren and Gregory Shakhnarovich and Jessica K. Hodgins and Hanspeter Pfister and Paul Viola},
     title = {Learning silhouette features for control of human motion},
     journal = {ACM Trans. Graph.},
     volume = {24},
     number = {4},
     year = {2005},
     issn = {0730-0301},
     pages = {1303--1331},
     doi = {http://doi.acm.org/10.1145/1095878.1095882},
     publisher = {ACM Press},
     address = {New York, NY, USA},
     note = {
    	 Three dimensional human motion tracking
    		 a) Yamamoto et al 1998
    		 b) Bregler and Malik 1998
    		 c) Delamarre and Faugeras 1999 - Three synchronized cameras to compute silhouettes - synthetic silhouettes compared to real ones.
    		 	The difference between silhouettes is used to generate "forces" which are then applied to the 3D articulated model to achieve a better match (iteratively? :-P)
    		 d) Sminchisescu and Triggs 2003
    	 In general, these systems assume accurate initialization and then track changes in pose based on an articulated 3D human model.
    
    	 Lee et al 2002 - Single camera vision interface, retrieved 3D motion data from the database by extracting silhouettes and comparing global features (Hu moments).
     	 Carranza et al 2003 - 3D reconstruction of body shape .
    	 The body configuration for a new silhouette is estimated by searching the database for a similar silhouette and retrieving the corresponding body configuration + changes in global position and orientation.
    	 VIOLA AND JONES 2001
    	 How to compare two silhouettes - take the hamming distance of binary feature vectors. 
    		# How is a feature vector created? By applying a filter to the silhouette and storing the response. 
    		# What is a filter? A black and white rectangle, centered at some position on the silhouette; 
    		  the pixels of the black region are subtracted from the pixels of the white region. 
    		# How many filters can we possibly have? 
    		  As many as you choose - variables include the size of the filter, and the center position. 
    		# How do we choose the most descriptive of all these filters? Adaboost (Each filter acts like a weak classifier).
    
     	}
     }
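    The filter / feature-vector / Hamming-distance pipeline from the note, in miniature. The filter bank and silhouettes are toy data, and real filters vary in size, position, and orientation (with Adaboost selecting the descriptive ones):

```python
import numpy as np

def rect_filter_response(sil, top, left, h, w):
    """One rectangle filter: the white half minus the black half of an
    (h x 2w) rectangle placed at (top, left) on the silhouette."""
    white = sil[top:top + h, left:left + w].sum()
    black = sil[top:top + h, left + w:left + 2 * w].sum()
    return int(white) - int(black)

def feature_vector(sil, filters):
    """Threshold each filter response at 0 to get one bit per filter."""
    return np.array([rect_filter_response(sil, *f) > 0 for f in filters])

def hamming(a, b):
    """Compare two silhouettes via the Hamming distance of their bits."""
    return int(np.count_nonzero(a != b))

filters = [(0, 0, 2, 1), (0, 1, 2, 1), (1, 0, 1, 2)]  # arbitrary toy bank
s1 = np.array([[1, 1, 0, 0],
               [1, 1, 0, 0],
               [0, 0, 0, 0]])
s2 = np.array([[0, 0, 1, 1],
               [0, 0, 1, 1],
               [0, 0, 0, 0]])
d = hamming(feature_vector(s1, filters), feature_vector(s2, filters))
```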
    
    

  20. "Estimation Algorithms for Ambiguous Visual Models--Three-Dimensional Human Modeling and Motion Reconstruction in Monocular Video Sequences"
    http://citeseer.ist.psu.edu/sminchisescu02estimation.html/

    @phdthesis{ sminchisescu:2002,
      author = "C. Sminchisescu",
      title = "Estimation Algorithms for Ambiguous Visual Models--Three-Dimensional Human
        Modeling and Motion Reconstruction in Monocular Video Sequences",
      text = "C. Sminchisescu. Estimation Algorithms for Ambiguous Visual Models--Three-Dimensional
        Human Modeling and Motion Reconstruction in Monocular Video Sequences. PhD
        thesis, Institut National Polytechnique de Grenoble (INRIA), July 2002.",
      year = "2002",
      url = "citeseer.ist.psu.edu/sminchisescu02estimation.html",
      note = "most comprehensive body of work on monocular tracking"
    }
    

  21. "Lumo: illumination for cel animation"
    http://portal.acm.org/citation.cfm?id=508538&dl=ACM&coll=portal&CFID=27609400&CFTOKEN=25015575#

    @inproceedings{johnston:2002,
     author = {Scott F. Johnston},
     title = {Lumo: illumination for cel animation},
     booktitle = {NPAR '02: Proceedings of the 2nd international symposium on Non-photorealistic animation and rendering},
     year = {2002},
     isbn = {1-58113-494-0},
     pages = {45--ff},
     location = {Annecy, France},
     doi = {http://doi.acm.org/10.1145/508530.508538},
     publisher = {ACM Press},
     address = {New York, NY, USA},
     }
    

  22. "Keyframe Animation using an Artist's Doll"
    http://www.soe.ucsc.edu/classes/cmps262/Winter07/projects/keyframe-animation/

    @misc{gunawardane:2007,
     author = {Prabath Gunawardane and Eddy Chandra and Tien-Chieng Jack Feng and James Davis},
     title = {Keyframe Animation using an Artist's Doll},
     booktitle = {SIGGRAPH '07 Sketches},
     year = {2007},
     location = {San Diego, U.S.A.},
     note = {Artist's doll with each joint coloured uniquely. Take two pictures of this doll once its pose has been set by the animator.
     Use stereo triangulation to determine exact (ha ha) 3D pose. Once two keyframes have been found, interpolate (in Maya) to get an animation.}
     }
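    For a rectified stereo pair, the triangulation step is just disparity-to-depth plus back-projection. A sketch with made-up focal length and baseline; the actual setup may use full projective triangulation from calibrated cameras:

```python
def triangulate_rectified(xl, xr, y, focal, baseline):
    """Depth from a rectified stereo pair: disparity d = xl - xr,
    Z = f*B/d, then back-project to recover X and Y (camera frame)."""
    d = xl - xr
    if d <= 0:
        raise ValueError("non-positive disparity")
    Z = focal * baseline / d
    X = xl * Z / focal
    Y = y * Z / focal
    return X, Y, Z

# A joint seen at x=10 px (left) and x=8 px (right); f=100 px, B=0.1 m.
X, Y, Z = triangulate_rectified(10.0, 8.0, 4.0, 100.0, 0.1)
```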
    

  23. "Human Motion Perception: Does Actor Size Matter in Motion Capture"
    http://www.siggraph.org/s2007/attendees/posters/8.html

    @misc{celada:2007,
     author = {Pierfrancesco Celada and Sian Lawson},
     title = {Human Motion Perception: Does Actor Size Matter in Motion Capture},
     booktitle = {SIGGRAPH '07 Poster},
     year = {2007},
     location = {San Diego, U.S.A.},
     note = { Take motion captured data of actors of different sizes, transfer all motions to stick figures of a canonical size.
     Ask subjects to guess if the actors were large or small. 
     Subjects guessed correctly much of the time, and said they used cues like large people will need to move their joints by a smaller angle.
     }
     }
    

  24. (On the side, not siggraph. This one came up when I googled the previous one.) "Perception of Human Motion"
    http://arjournals.annualreviews.org/doi/abs/10.1146/annurev.psych.57.102904.190152

    @article{blake:2006,
     author = {Randolph Blake and Maggie Shiffrar},
     title = {Perception of Human Motion},
     journal = {Annual review of Psychology},
     year = {2006},
     volume = {58},
     pages = {47-73},
     note = { }
     }
    

  25. "Proprioceptive Sense in an Art Installation: Amputation Box"

    @misc{johnston:2007,
     author = {David J. Johnston and Jinsil Seo and Diane Gromala},
     title = {Proprioceptive Sense in an Art Installation: Amputation Box},
     booktitle = {SIGGRAPH '07 Poster},
     year = {2007},
     location = {San Diego, U.S.A.},
     note = { }
     }
    

    On a related note, SIGGRAPH'99 : Virtual Imaginations Require Real Bodies

  26. "Optimization-based Interactive Motion Synthesis for Virtual Characters"
    http://boomer.usc.edu/~sumit/papers/sig07/jain_phoward_sig07.pdf

    @misc{jain:2007,
     author = {Sumit Jain and Yuting Ye and C. Karen Liu},
     title = {Optimization-based Interactive Motion Synthesis for Virtual Characters},
     booktitle = {SIGGRAPH '07 Poster},
     year = {2007},
     location = {San Diego, U.S.A.},
     note = {Interesting dodge example.}
     }