java.lang.Object
  org.htmlparser.tests.utilTests.CharacterTranslationTest.Generate
Create a character reference translation class source file. Usage:
java -classpath .:lib/htmlparser.jar Generate > Translate.java
Derived from HTMLStringFilter.java, provided as an example with the
htmlparser.jar file available at htmlparser.sourceforge.net,
written by Somik Raha (somik@industriallogic.com,
http://industriallogic.com).
| Field Summary | |
protected Parser |
mParser
The working parser. |
protected java.lang.String |
nl
|
| Constructor Summary | |
CharacterTranslationTest.Generate()
Create a Generate object. |
|
| Method Summary | |
void |
extract(java.lang.String string,
java.io.PrintWriter out)
Parse the SGML declaration for the character entity reference name, the equivalent numeric character reference, and a comment. |
void |
gather(Node node,
java.lang.StringBuffer buffer)
|
int |
indexOfWhitespace(java.lang.String string,
int index)
Find the lowest index of whitespace (space or newline). |
java.lang.String |
pack(java.lang.String string)
Rewrite the comment string. |
java.lang.String |
pad(java.lang.String string,
char character,
int length)
Pad a string on the left with the given character to the length specified. |
void |
parse(java.io.PrintWriter out)
Pull out text elements from the HTML. |
java.lang.String |
pretty(java.lang.String string)
Pretty up a comment string. |
void |
sgml(java.lang.String string,
java.io.PrintWriter out)
Extract special characters. |
java.lang.String |
translate(java.lang.String string)
Translate character references. |
java.lang.String |
unicode(java.lang.String string)
Convert the textual representation of the numeric character reference to a character. |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
protected Parser mParser
protected java.lang.String nl
| Constructor Detail |
public CharacterTranslationTest.Generate()
throws ParserException
Create a Generate object whose parser is pointed
at http://www.w3.org/TR/REC-html40/sgml/entities.html
and has the standard scanners registered.
| Method Detail |
public java.lang.String translate(java.lang.String string)
string - The raw string.
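As an illustration of what translating character references involves, here is a minimal JDK-only sketch. It is not the class's actual implementation: the name `TranslateSketch`, the regular expression, and the small table of named references are assumptions, and only decimal numeric references plus `lt`, `gt`, `amp`, and `quot` are handled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TranslateSketch {
    // Decimal numeric references plus four named ones (an assumed, partial set).
    private static final Pattern REFERENCE =
            Pattern.compile("&(#\\d{1,7}|lt|gt|amp|quot);");
    private static final Map<String, String> NAMED = new HashMap<String, String>();
    static {
        NAMED.put("lt", "<");
        NAMED.put("gt", ">");
        NAMED.put("amp", "&");
        NAMED.put("quot", "\"");
    }

    /** Replace each recognized reference with its character, in a single pass. */
    static String translate(String string) {
        Matcher matcher = REFERENCE.matcher(string);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String reference = matcher.group(1);
            String replacement;
            if (reference.charAt(0) == '#') {
                int codePoint = Integer.parseInt(reference.substring(1));
                // Out-of-range numeric references are left untouched.
                replacement = Character.isValidCodePoint(codePoint)
                        ? new String(Character.toChars(codePoint))
                        : matcher.group();
            } else {
                replacement = NAMED.get(reference);
            }
            // quoteReplacement keeps '$' and '\' in the replacement literal.
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("&lt;!ENTITY nbsp CDATA &quot;&amp;#160;&quot;&gt;"));
    }
}
```

Because the replacement is done in one pass, `&amp;#160;` becomes the literal text `&#160;` and is not decoded a second time, which is what is wanted when the page being read displays entity declarations as escaped text.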
public void gather(Node node,
java.lang.StringBuffer buffer)
public int indexOfWhitespace(java.lang.String string,
int index)
string - The string to look in.
index - Where to start looking.
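A minimal sketch of such a search follows. The class name `WhitespaceSketch` is hypothetical, and returning -1 when no whitespace is found is an assumption, since the return convention is not documented here.

```java
final class WhitespaceSketch {
    /**
     * Return the lowest index at or after {@code index} that holds a space
     * or a newline, or -1 if there is none (the -1 result is an assumption).
     */
    static int indexOfWhitespace(String string, int index) {
        int space = string.indexOf(' ', index);
        int newline = string.indexOf('\n', index);
        if (space == -1) {
            return newline;
        }
        if (newline == -1) {
            return space;
        }
        return Math.min(space, newline);
    }

    public static void main(String[] args) {
        System.out.println(indexOfWhitespace("nbsp CDATA", 0));
    }
}
```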
public java.lang.String pack(java.lang.String string)
Rewrite the comment string. Comments in the entity table can span lines, for example:
-- latin capital letter I with diaeresis,
U+00CF ISOlat1
so we just want to make a one-liner without the extra spaces and newlines.
string - The raw comment.
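One way to produce such a one-liner is to collapse every run of whitespace into a single space. This is a sketch under that assumption; the class name `PackSketch` is hypothetical and the real method may treat spacing differently.

```java
final class PackSketch {
    /** Collapse runs of spaces, tabs, and newlines into single spaces. */
    static String pack(String string) {
        return string.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String raw = "-- latin capital letter I with diaeresis,\n"
                + "                                  U+00CF ISOlat1";
        System.out.println(pack(raw));
    }
}
```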
public java.lang.String pretty(java.lang.String string)
string - The comment to operate on.
public java.lang.String pad(java.lang.String string,
char character,
int length)
string - The string to pad.
character - The character to pad with.
length - The size to pad to.
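Left padding can be sketched as below. The class name `PadSketch` is hypothetical, and returning strings that are already long enough unchanged is an assumption.

```java
final class PadSketch {
    /** Prepend {@code character} until the result is {@code length} long. */
    static String pad(String string, char character, int length) {
        StringBuilder padded = new StringBuilder();
        for (int i = string.length(); i < length; i++) {
            padded.append(character);
        }
        // Strings already at or beyond the target length pass through unchanged.
        return padded.append(string).toString();
    }

    public static void main(String[] args) {
        // Pad a hexadecimal code point to four digits.
        System.out.println(pad(Integer.toHexString(0xa0), '0', 4));
    }
}
```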
public java.lang.String unicode(java.lang.String string)
string - The numeric character reference (in quotes).
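The conversion might look like the following sketch, which assumes the input has the form `"&#160;"` including its surrounding quotes. The class name `UnicodeSketch`, the support for hexadecimal references, and the exception thrown on malformed input are all assumptions.

```java
final class UnicodeSketch {
    /** Convert a quoted numeric character reference such as "&#160;" to its character. */
    static String unicode(String string) {
        String reference = string.trim();
        // Drop the surrounding double quotes, if present.
        if (reference.length() >= 2 && reference.startsWith("\"") && reference.endsWith("\"")) {
            reference = reference.substring(1, reference.length() - 1);
        }
        if (!reference.startsWith("&#") || !reference.endsWith(";")) {
            throw new IllegalArgumentException("not a numeric character reference: " + string);
        }
        String digits = reference.substring(2, reference.length() - 1);
        int codePoint = (digits.startsWith("x") || digits.startsWith("X"))
                ? Integer.parseInt(digits.substring(1), 16)
                : Integer.parseInt(digits);
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        // Print the code point of the decoded character.
        System.out.println((int) unicode("\"&#160;\"").charAt(0));
    }
}
```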
public void extract(java.lang.String string,
java.io.PrintWriter out)
string - The contents of the SGML declaration.
out - The sink for output.
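To illustrate pulling the three parts out of a declaration, here is a sketch based on a regular expression. It assumes the contents are the text between `<!ENTITY` and the closing `>`, returns the parts instead of writing them to a `PrintWriter`, and uses a hypothetical class name, `ExtractSketch`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExtractSketch {
    // name CDATA "&#NNN;" -- comment --   (assumed layout of the contents)
    private static final Pattern DECLARATION = Pattern.compile(
            "\\s*(\\w+)\\s+CDATA\\s+\"(&#\\d+;)\"\\s*--\\s*(.*?)\\s*--\\s*",
            Pattern.DOTALL);

    /**
     * Return {name, numeric reference, comment}, or null if the contents
     * do not have the assumed layout.
     */
    static String[] extract(String contents) {
        Matcher matcher = DECLARATION.matcher(contents);
        if (!matcher.matches()) {
            return null;
        }
        return new String[] { matcher.group(1), matcher.group(2), matcher.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = extract(
                " nbsp CDATA \"&#160;\" -- no-break space = non-breaking space, U+00A0 ISOnum --");
        System.out.println(parts[0] + " | " + parts[1] + " | " + parts[2]);
    }
}
```

`Pattern.DOTALL` lets the comment group match across newlines, since comments in the entity table may be wrapped onto a second line.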
public void sgml(java.lang.String string,
java.io.PrintWriter out)
Extract special characters. Scans the string for declarations of the form:
<!ENTITY nbsp CDATA "&#160;" -- no-break space = non-breaking space, U+00A0 ISOnum -->
and emits a Java definition for each.
string - The raw string from w3.org.
out - The sink for output.
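The scanning step can be sketched with plain `indexOf` searches for the opening `<!ENTITY` and the closing `-->`. The class and method names below are hypothetical, and the sketch only collects the declaration contents; the format of the emitted Java definitions is not described here, so it is not reproduced.

```java
import java.util.ArrayList;
import java.util.List;

final class SgmlScanSketch {
    /** Collect the contents of every <!ENTITY ... --> declaration in the text. */
    static List<String> declarations(String text) {
        final String open = "<!ENTITY";
        final String close = "-->";
        List<String> found = new ArrayList<String>();
        int from = 0;
        while (true) {
            int begin = text.indexOf(open, from);
            if (begin == -1) {
                break;
            }
            int end = text.indexOf(close, begin);
            if (end == -1) {
                break; // unterminated declaration: stop scanning
            }
            // Keep everything up to, but not including, the final '>'.
            found.add(text.substring(begin + open.length(), end + 2).trim());
            from = end + close.length();
        }
        return found;
    }

    public static void main(String[] args) {
        String page = "<!ENTITY nbsp CDATA \"&#160;\" -- no-break space -->\n"
                + "<!ENTITY iexcl CDATA \"&#161;\" -- inverted exclamation mark -->";
        for (String declaration : declarations(page)) {
            System.out.println(declaration);
        }
    }
}
```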
public void parse(java.io.PrintWriter out)
throws ParserException
out - The sink for output.
Throws: ParserException