This README describes data in the CMU Book Summary Corpus, a collection of 16,559 book plot summaries extracted from Wikipedia, along with aligned metadata from Freebase, including book author, title, and genre.

All data is released under a Creative Commons Attribution-ShareAlike License. For questions or comments, please contact David Bamman (dbamman@cs.cmu.edu).

###
#
# DATA
#
###

booksummaries.txt

Plot summaries of 16,559 books extracted from the November 2, 2012 dump of English-language Wikipedia. The file is tab-separated, with the following columns:

1. Wikipedia article ID
2. Freebase ID
3. Book title
4. Author
5. Publication date
6. Book genres (Freebase ID:name tuples)
7. Plot summary
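
The column layout above can be read with a short script. The following is a minimal sketch in Python, not an official loader. It assumes the file is UTF-8 encoded and that the genre field is serialized as a JSON object mapping Freebase IDs to genre names; neither assumption is stated in this README, so check both against the actual file. The names BookRecord, parse_genres, parse_line, and read_books are hypothetical helpers introduced only for illustration.

```python
import json
from typing import Dict, Iterator, NamedTuple


class BookRecord(NamedTuple):
    """One row of booksummaries.txt, in the column order listed above."""
    wikipedia_id: str
    freebase_id: str
    title: str
    author: str
    publication_date: str
    genres: Dict[str, str]  # Freebase ID -> genre name
    summary: str


def parse_genres(field: str) -> Dict[str, str]:
    """Parse the genre column.

    ASSUMPTION: the field is a JSON object such as {"/m/0xyz": "Fiction"}.
    An empty or unparseable field yields an empty dict.
    """
    field = field.strip()
    if not field:
        return {}
    try:
        parsed = json.loads(field)
    except json.JSONDecodeError:
        return {}
    return parsed if isinstance(parsed, dict) else {}


def parse_line(line: str) -> BookRecord:
    """Split one tab-separated line into its seven columns."""
    # maxsplit=6 keeps any stray tabs inside the summary column.
    parts = line.rstrip("\r\n").split("\t", 6)
    if len(parts) != 7:
        raise ValueError(f"expected 7 columns, got {len(parts)}")
    wiki_id, fb_id, title, author, date, genres, summary = parts
    return BookRecord(wiki_id, fb_id, title, author, date,
                      parse_genres(genres), summary)


def read_books(path: str, encoding: str = "utf-8") -> Iterator[BookRecord]:
    """Yield one BookRecord per non-empty line (UTF-8 is an assumption)."""
    with open(path, encoding=encoding) as handle:
        for line in handle:
            if line.strip():
                yield parse_line(line)
```

Typical use would be to iterate over read_books("booksummaries.txt") and filter on fields such as author or genres. Metadata fields are kept as plain strings, so a missing author or publication date shows up as an empty string rather than raising an error.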
