project - resources
and government administrators need a variety of navigation aids and analysis
tools to help them understand the contents of large public comment databases.
These aids and tools include full-text search, automatic construction
of browsing hierarchies, frequency analysis of discussion topics, and
summarization of similar comments, as well as more complex analysis tools
that identify stakeholder communities represented in a set of comments.
Below is an initial attempt to provide some of these tools as applied to
the collection of 20,936 public comments from
the USDA's National Organic Program.
statistics: these are general statistics gathered from
the USDA corpus.
number of comments: 20,936
average length of comments: 176 words
average size of comments: 1,106 bytes
size of collection: 23,172,877 bytes
Updated on April 21, 2003.
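As a rough illustration of how figures like these could be computed, the sketch below assumes the comments are available as a list of strings. The function name `corpus_statistics`, the whitespace tokenization, the UTF-8 byte count, and the sample comments are all assumptions made for the example, not details of the project's actual tooling.

```python
def corpus_statistics(comments):
    """Return simple summary statistics for a list of comment strings."""
    n = len(comments)
    if n == 0:
        raise ValueError("corpus contains no comments")
    # Assumption: a "word" is any whitespace-delimited token.
    total_words = sum(len(c.split()) for c in comments)
    # Assumption: sizes are measured on the UTF-8 encoding of each comment.
    total_bytes = sum(len(c.encode("utf-8")) for c in comments)
    return {
        "number_of_comments": n,
        "average_length_words": total_words / n,
        "average_size_bytes": total_bytes / n,
        "collection_size_bytes": total_bytes,
    }


# Hypothetical sample data, for illustration only.
sample = [
    "Please keep organic standards strict.",
    "I support the proposed rule.",
]
stats = corpus_statistics(sample)
```

Dividing the collection size by the number of comments gives the average size per comment, which is how the per-comment and whole-collection figures relate.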
hierarchy: a hierarchy of the stakeholders extracted from the comments. The
stakeholders are placed in a hierarchy intended to give the user a quick
overview of which individuals and groups commented on the regulation. The
hierarchy is ordered by the frequency of each stakeholder (the number following
each stakeholder). Note: your browser must be Java-enabled to view the hierarchy.
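The frequency ordering described above can be sketched as follows. This is a minimal sketch that assumes stakeholder names have already been extracted, one list per comment. The function name `stakeholders_by_frequency` and the sample names are hypothetical, and counting each stakeholder once per comment is an assumption about what "frequency" means here. The extraction step and the nesting of the hierarchy are not shown.

```python
from collections import Counter


def stakeholders_by_frequency(extracted):
    """Order stakeholders by how many comments mention them, most frequent first."""
    counts = Counter()
    for names in extracted:
        # Assumption: a stakeholder counts once per comment, however often it recurs.
        counts.update(set(names))
    # Sort by descending frequency, breaking ties alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical extraction output, for illustration only.
extracted = [
    ["farmers", "consumers"],
    ["farmers"],
    ["retailers", "farmers", "consumers"],
]
ordered = stakeholders_by_frequency(extracted)
for name, frequency in ordered:
    print(f"{name} ({frequency})")
```

Each printed line pairs a stakeholder with its count, mirroring the number that follows each stakeholder in the hierarchy.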