JAWJAW: Java Wrapper for Japanese WordNet
Last modified: 2011-10-06
Introduction
JAWJAW (JAva Wrapper for JApanese WordNet) is a Java API for the Japanese WordNet (wn-ja) database. It offers access to lexical knowledge about a given word, such as its hypernyms, hyponyms, definitions, and translations (English <--> Japanese).
The API hides the details of the wn-ja DB schema from the programmer. We provide both a simple API for general Java programmers and a more fine-grained API for Natural Language Processing (NLP) application developers.
Simple API
Just call methods in the façade class. You can see the list of available methods here.
Sample code:
import java.util.Set;

// Package names are assumed from the JAWJAW source layout; adjust them if your version differs.
import edu.cmu.lti.jawjaw.JAWJAW;
import edu.cmu.lti.jawjaw.pobj.POS;

public class SimpleDemo {
    private static void run(String word, POS pos) {
        // Access Japanese WordNet through the façade class called JAWJAW
        Set<String> hypernyms = JAWJAW.findHypernyms(word, pos);
        Set<String> hyponyms = JAWJAW.findHyponyms(word, pos);
        Set<String> consequents = JAWJAW.findEntailments(word, pos);
        Set<String> translations = JAWJAW.findTranslations(word, pos);
        Set<String> definitions = JAWJAW.findDefinitions(word, pos);
        // Show the results (note: results for all senses of a polysemous word are mixed together)
        System.out.println("hypernyms of " + word + " : \t" + hypernyms);
        System.out.println("hyponyms of " + word + " : \t" + hyponyms);
        System.out.println(word + " entails : \t\t" + consequents);
        System.out.println("translations of " + word + " : \t" + translations);
        System.out.println("definitions of " + word + " : \t" + definitions);
    }

    public static void main(String[] args) {
        // Demo for "買収" (verb), which means "to acquire"
        SimpleDemo.run("買収", POS.v);
    }
}
Output:
API for NLP Application Developers
With this API, you can retrieve the raw content of the DB through DAOs (Data Access Objects).
Data model:
The domain model is generated from the Japanese WordNet DB schema. The API provides a data class for each entity in the model, together with its DAO. The domain attributes "pos", "link" and "lang" are implemented as enum classes.
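As a rough illustration of why enum-typed attributes are convenient, the sketch below converts raw column strings into enum constants. The class and method names (EnumAttributeSketch, parsePos, parseLang) are hypothetical stand-ins and not the API's own classes; only the values n, v, a, r and eng, jpn are taken to match wn-ja.

```java
public class EnumAttributeSketch {
    // Hypothetical stand-ins for the API's enums; constant names mirror the raw DB values.
    public enum Pos { n, v, a, r }
    public enum Lang { eng, jpn }

    // Converts a raw "pos" column value into a typed constant.
    // Enum.valueOf throws IllegalArgumentException for an unknown value.
    public static Pos parsePos(String raw) {
        return Pos.valueOf(raw.trim());
    }

    // Converts a raw "lang" column value into a typed constant.
    public static Lang parseLang(String raw) {
        return Lang.valueOf(raw.trim());
    }

    public static void main(String[] args) {
        System.out.println(parsePos("v"));
        System.out.println(parseLang("jpn"));
    }
}
```

With typed constants, a misspelled attribute value fails at the conversion step instead of silently producing an empty query result later.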
Available concept relationships:
Here's a summary of concept relationship "links" stored in the synlink table. (As of wn-ja v0.9)
| link | description | count |
| --- | --- | --- |
| also | See also | 2692 |
| syns | Synonyms | 0 |
| hype | Hypernyms | 89089 |
| inst | Instances | 8577 |
| hypo | Hyponyms | 89089 |
| hasi | Has Instance | 8577 |
| mero | Meronyms | 0 |
| mmem | Meronyms --- Member | 12293 |
| msub | Meronyms --- Substance | 979 |
| mprt | Meronyms --- Part | 9097 |
| holo | Holonyms | 0 |
| hmem | Holonyms --- Member | 12293 |
| hsub | Holonyms --- Substance | 797 |
| hprt | Holonyms --- Part | 9097 |
| attr | Attributes | 1278 |
| sim | Similar to | 21386 |
| enta | Entails | 408 |
| caus | Causes | 220 |
| dmnc | Domain --- Category | 6643 |
| dmnu | Domain --- Usage | 967 |
| dmnr | Domain --- Region | 1345 |
| dmtc | In Domain --- Category | 6643 |
| dmtu | In Domain --- Usage | 967 |
| dmtr | In Domain --- Region | 1345 |
| ants | Antonyms | 0 |
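Many of the links in the table come in pairs that describe the same relationship from opposite directions (for example hype/hypo or mmem/hmem). The sketch below models that pairing with a small lookup table. It is a minimal illustration under that assumption: LinkInverseSketch and inverseOf are hypothetical names, not part of the API, and only the paired links are included.

```java
import java.util.EnumMap;
import java.util.Map;

public class LinkInverseSketch {
    // Hypothetical enum covering only the paired links from the table above.
    public enum Link {
        hype, hypo, inst, hasi, mmem, hmem, msub, hsub,
        mprt, hprt, dmnc, dmtc, dmnu, dmtu, dmnr, dmtr
    }

    private static final Map<Link, Link> INVERSE = new EnumMap<Link, Link>(Link.class);

    // Registers a pair in both directions.
    private static void pair(Link a, Link b) {
        INVERSE.put(a, b);
        INVERSE.put(b, a);
    }

    static {
        pair(Link.hype, Link.hypo); // hypernym <-> hyponym
        pair(Link.inst, Link.hasi); // instance <-> has instance
        pair(Link.mmem, Link.hmem); // member meronym <-> member holonym
        pair(Link.msub, Link.hsub); // substance meronym <-> substance holonym
        pair(Link.mprt, Link.hprt); // part meronym <-> part holonym
        pair(Link.dmnc, Link.dmtc); // domain category <-> in domain category
        pair(Link.dmnu, Link.dmtu); // domain usage <-> in domain usage
        pair(Link.dmnr, Link.dmtr); // domain region <-> in domain region
    }

    // Returns the link that describes the same relationship in the opposite direction.
    public static Link inverseOf(Link link) {
        return INVERSE.get(link);
    }

    public static void main(String[] args) {
        System.out.println(inverseOf(Link.hype)); // prints "hypo"
    }
}
```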
Total numbers of concepts and words:
- 49,190 concepts (called synsets in WordNet)
- 85,966 words
- 156,684 word definitions (pairs of word and synset)
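The third count is larger than the other two because a word can belong to several synsets and a synset can contain several words. The toy sketch below shows that many-to-many pairing in memory. Everything in it is illustrative: the Sense class here is a hypothetical stand-in for the API's data class, and the lemmas and synset IDs ("bank", "S1", ...) are made-up placeholders, not wn-ja data.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SenseModelSketch {
    // Toy stand-in: one sense pairs one word (lemma) with one synset.
    public static final class Sense {
        final String lemma;
        final String synsetId;

        Sense(String lemma, String synsetId) {
            this.lemma = lemma;
            this.synsetId = synsetId;
        }
    }

    // All synsets that a given lemma belongs to, in insertion order.
    public static Set<String> synsetsOf(List<Sense> senses, String lemma) {
        Set<String> result = new LinkedHashSet<String>();
        for (Sense s : senses) {
            if (s.lemma.equals(lemma)) {
                result.add(s.synsetId);
            }
        }
        return result;
    }

    // All lemmas that share a given synset, in insertion order.
    public static Set<String> lemmasOf(List<Sense> senses, String synsetId) {
        Set<String> result = new LinkedHashSet<String>();
        for (Sense s : senses) {
            if (s.synsetId.equals(synsetId)) {
                result.add(s.lemma);
            }
        }
        return result;
    }

    // Made-up data: 2 words, 2 synsets, 3 senses.
    public static List<Sense> toyData() {
        List<Sense> senses = new ArrayList<Sense>();
        senses.add(new Sense("bank", "S1"));
        senses.add(new Sense("bank", "S2"));
        senses.add(new Sense("depository", "S1"));
        return senses;
    }

    public static void main(String[] args) {
        List<Sense> senses = toyData();
        System.out.println(synsetsOf(senses, "bank")); // prints "[S1, S2]"
        System.out.println(lemmasOf(senses, "S1"));    // prints "[bank, depository]"
    }
}
```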
Sample code:
import java.util.List;

// Package names are assumed from the JAWJAW source layout; adjust them if your version differs.
import edu.cmu.lti.jawjaw.db.*;
import edu.cmu.lti.jawjaw.pobj.*;

public class AdvancedDemo {
    private static void run(String word, POS pos) {
        // Access the Japanese WordNet DB and process the raw data.
        // For brevity, this demo assumes the word exists in the DB (no empty-list checks).
        List<Word> words = WordDAO.findWordsByLemmaAndPos(word, pos);
        List<Sense> senses = SenseDAO.findSensesByWordid(words.get(0).getWordid());
        String synsetId = senses.get(0).getSynset();
        Synset synset = SynsetDAO.findSynsetBySynset(synsetId);
        SynsetDef synsetDef = SynsetDefDAO.findSynsetDefBySynsetAndLang(synsetId, Lang.eng);
        List<Synlink> synlinks = SynlinkDAO.findSynlinksBySynset(synsetId);
        // Show the results
        System.out.println(words.get(0));
        System.out.println(senses.get(0));
        System.out.println(synset);
        System.out.println(synsetDef);
        System.out.println(synlinks.get(0));
    }

    public static void main(String[] args) {
        // Demo for "自然言語処理" (noun), which means NLP
        AdvancedDemo.run("自然言語処理", POS.n);
    }
}
Output:
Javadoc
Refer to this page.
Download
Download the latest version from here. (First version released on Mar 23, 2009; v1.0.0 released on Oct 6, 2011. License: Apache License, Version 2.0)
How to use
Download the DB from the Japanese WordNet website and put it under the data directory, e.g. "src/main/resources/wnjpn-0.9.db". JAWJAW works on JDK 5 or later. To compile the code and fetch the required libraries (i.e. sqlite-jdbc-3.6.11.jar, nestedvm-1.0.jar, junit-4.0.jar), we recommend Maven 2. With the provided pom.xml file, you can compile and resolve dependencies with "mvn compile" and sanity-check the code with "mvn test".
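Before running the demos, it can help to confirm that the DB file is where you expect it. The following is a minimal sketch using only the standard library; DbFileCheck and isUsableDb are hypothetical helper names, not part of JAWJAW, and the default path is simply the example location given above.

```java
import java.io.File;

public class DbFileCheck {
    // A file counts as usable here if it exists, is a regular file, is readable, and is not empty.
    // This does not verify that the content is a valid SQLite database.
    public static boolean isUsableDb(File f) {
        return f.isFile() && f.canRead() && f.length() > 0;
    }

    public static void main(String[] args) {
        // Default to the example location; pass another path as the first argument if needed.
        String path = args.length > 0 ? args[0] : "src/main/resources/wnjpn-0.9.db";
        File db = new File(path);
        if (isUsableDb(db)) {
            System.out.println("Found DB file: " + db.getAbsolutePath());
        } else {
            System.out.println("DB file missing or unreadable: " + db.getAbsolutePath());
        }
    }
}
```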
Future work
- Metrics for semantic similarity/distance between two synsets (released as WS4J, WordNet Similarity for Java)
- Command line interface
- Web interface
Contact
Hideki Shima at Carnegie Mellon University
Email: hideki at cs.cmu.edu