Seki et al., NTCIR 2006
From ScribbleWiki: Analysis of Social Media
Overview of Opinion Analysis Pilot Task at NTCIR-6
In the task, systems were analyzed and evaluated on the following four aspects.
Given a sentence,
- Does it express an opinion? - binary classification of opinionated sentences
- Is it a positive, negative or neutral statement? - polarity analysis
- Who expresses the opinion? - opinion holder extraction
- Is it relevant to the topic of the document set? - binary classification of relevance between topic and sentence
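The four judgments above can be pictured as a per-sentence annotation record. A minimal sketch (field names are illustrative, not the task's official schema):

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class SentenceJudgment:
    """Hypothetical record of the four judgments made for one sentence."""
    opinionated: bool        # does the sentence express an opinion?
    polarity: Optional[str]  # "POS" / "NEG" / "NEU", only meaningful if opinionated
    holders: List[str]       # extracted opinion-holder strings
    relevant: bool           # is the sentence relevant to the document-set topic?

s = SentenceJudgment(opinionated=True, polarity="NEG",
                     holders=["the prime minister"], relevant=True)
```

Polarity and holders only make sense for opinionated sentences, which is why polarity is optional here.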
|Language|News sources|Topics|Documents|Sentences|Opinionated (lenient/strict)|Relevant (lenient/strict)|
|---|---|---|---|---|---|---|
|Chinese|1998-1999 United Daily News, China Times, etc.|32|843|11,907|62% / 25%|39% / 16%|
|Japanese|1998-1999 Yomiuri and Mainichi|30|490|15,279|29% / 22%|64% / 49%|
|English|1998-1999 Mainichi Daily News, Korea Times, etc.|28|439|8,528|30% / 7%|69% / 37%|
The opinionated and relevant percentages are computed over all sentences under both the lenient and the strict standard, which are based on the degree of inter-annotator agreement.
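The lenient/strict distinction can be sketched as follows, assuming each sentence is judged by several annotators, that the lenient standard requires a majority of annotators to assign a label, and that the strict standard requires all of them to agree (the exact thresholds are an assumption here, not taken from the source):

```python
def gold_labels(votes):
    """Derive lenient and strict gold labels from per-annotator
    binary votes (e.g. "is this sentence opinionated?").

    lenient: label holds if a majority of annotators say so;
    strict:  label holds only if all annotators agree.
    """
    yes = sum(votes)
    lenient = yes * 2 > len(votes)  # strict majority
    strict = yes == len(votes)      # unanimous
    return lenient, strict

# Two of three annotators say "opinionated": the sentence counts
# under the lenient standard but not under the strict one.
print(gold_labels([True, True, False]))  # (True, False)
```

This also explains why the strict percentages in the table are always lower than the lenient ones: the unanimous set is a subset of the majority set.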
Systems were evaluated by precision, recall and F-measure on the opinionated, relevant and polarity judgments; opinion-holder extraction was evaluated semi-automatically (also with P, R, F).
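For a binary sentence-level judgment such as opinionated-or-not, the metrics reduce to the standard definitions over true/false positives and false negatives. A minimal sketch:

```python
def prf(system, gold):
    """Precision, recall and F1 for a binary per-sentence judgment,
    given parallel lists of system and gold boolean labels."""
    tp = sum(1 for s, g in zip(system, gold) if s and g)
    fp = sum(1 for s, g in zip(system, gold) if s and not g)
    fn = sum(1 for s, g in zip(system, gold) if not s and g)
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Toy example: 3 sentences marked opinionated by the system, 2 correctly.
p, r, f = prf([True, True, False, True], [True, False, False, True])
```

Because the lenient and strict standards yield different gold label sets, each run gets two scores per judgment, one against each standard.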
The Chinese, Japanese and English subtasks had 5, 3 and 6 participants, respectively. Performance varies greatly across subtasks. Looking at systems that participated in multiple subtasks, there appears to be a strong relationship between annotation quality (as measured by inter-annotator agreement) and system performance.