java.lang.Object
  org.htmlparser.nodeDecorators.AbstractNodeDecorator
Use either direct subclasses of the appropriate node and set them on the
PrototypicalNodeFactory,
or use a dynamic proxy implementing the required node type interface.
In the former case this avoids the wrapping and delegation, while the latter
case handles the wrapping and delegation without this class.
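The direct-subclass route can be sketched without the library. The following is a minimal, self-contained illustration, not the htmlparser API: SimpleText, RecordingText and SimpleFactory are hypothetical stand-ins for a node class, its subclass, and a prototype-based factory. It assumes only that the factory creates nodes by cloning a registered prototype, so that a subclass instance set as the prototype yields subclass instances.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a node class; not the htmlparser TextNode.
class SimpleText implements Cloneable
{
    private String text = "";

    public String getText () { return text; }
    public void setText (String text) { this.text = text; }

    // Hook called once the node is complete; the base class does nothing.
    public void doSemanticAction () { }

    @Override
    public SimpleText clone ()
    {
        try
        {
            return (SimpleText) super.clone ();
        }
        catch (CloneNotSupportedException e)
        {
            throw new AssertionError (e); // cannot happen: class is Cloneable
        }
    }
}

// Direct subclass: adds behaviour without wrapping or delegation.
class RecordingText extends SimpleText
{
    static final List<String> seen = new ArrayList<String> ();

    @Override
    public void doSemanticAction ()
    {
        seen.add (getText ());
    }
}

// Hypothetical stand-in for a prototype-based node factory.
class SimpleFactory
{
    private SimpleText prototype = new SimpleText ();

    public void setTextPrototype (SimpleText prototype)
    {
        this.prototype = prototype;
    }

    public SimpleText createText (String text)
    {
        SimpleText node = prototype.clone (); // clone keeps the runtime subclass
        node.setText (text);
        return node;
    }
}

class SubclassDemo
{
    public static void main (String[] args)
    {
        SimpleFactory factory = new SimpleFactory ();
        factory.setTextPrototype (new RecordingText ());
        SimpleText node = factory.createText ("hello");
        node.doSemanticAction (); // runs the subclass hook
        System.out.println (RecordingText.seen);
    }
}
```

Because Object.clone () preserves the runtime class, every node the factory hands out is a RecordingText, and no decorator layer is needed.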
Here is an example of how to use dynamic proxies to accomplish the same effect as using decorators to wrap Text nodes:
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

import org.htmlparser.Parser;
import org.htmlparser.PrototypicalNodeFactory;
import org.htmlparser.Text;
import org.htmlparser.nodes.TextNode;
import org.htmlparser.util.ParserException;

public class TextProxy
    implements
        InvocationHandler
{
    protected Object mObject;

    public static Object newInstance (Object object)
    {
        Class cls;

        cls = object.getClass ();
        return (Proxy.newProxyInstance (
            cls.getClassLoader (),
            cls.getInterfaces (),
            new TextProxy (object)));
    }

    private TextProxy (Object object)
    {
        mObject = object;
    }

    public Object invoke (Object proxy, Method m, Object[] args)
        throws Throwable
    {
        Object result;
        String name;

        try
        {
            result = m.invoke (mObject, args);
            name = m.getName ();
            if (name.equals ("clone"))
                result = newInstance (result); // wrap the cloned object
            else if (name.equals ("doSemanticAction")) // or other methods
                System.out.println (mObject); // do the needful on the TextNode
        }
        catch (InvocationTargetException e)
        {
            throw e.getTargetException ();
        }
        catch (Exception e)
        {
            throw new RuntimeException ("unexpected invocation exception: " +
                e.getMessage());
        }
        finally
        {
        }

        return (result);
    }

    public static void main (String[] args)
        throws
            ParserException
    {
        // create the wrapped text node and set it as the prototype
        Text text = (Text) TextProxy.newInstance (new TextNode (null, 0, 0));
        PrototypicalNodeFactory factory = new PrototypicalNodeFactory ();
        factory.setTextPrototype (text);
        // perform the parse
        Parser parser = new Parser (args[0]);
        parser.setNodeFactory (factory);
        parser.parse (null);
    }
}
Node wrapping base class.
| Field Summary | | |
| protected Text | delegate | Deprecated. |
| Constructor Summary | | |
| protected | AbstractNodeDecorator(Text delegate) | Deprecated. |
| Method Summary | | |
| void | accept(NodeVisitor visitor) | Deprecated. Apply the visitor to this node. |
| java.lang.Object | clone() | Deprecated. Clone this object. |
| void | collectInto(NodeList list, NodeFilter filter) | Deprecated. Collect this node and its child nodes into a list, provided the node satisfies the filtering criteria. |
| void | doSemanticAction() | Deprecated. Perform the meaning of this tag. |
| boolean | equals(java.lang.Object arg0) | Deprecated. |
| NodeList | getChildren() | Deprecated. Get the children of this node. |
| int | getEndPosition() | Deprecated. Gets the ending position of the node. |
| Node | getFirstChild() | Deprecated. Get the first child of this node. |
| Node | getLastChild() | Deprecated. Get the last child of this node. |
| Node | getNextSibling() | Deprecated. Get the next sibling to this node. |
| Page | getPage() | Deprecated. Get the page this node came from. |
| Node | getParent() | Deprecated. Get the parent of this node. |
| Node | getPreviousSibling() | Deprecated. Get the previous sibling to this node. |
| int | getStartPosition() | Deprecated. Gets the starting position of the node. |
| java.lang.String | getText() | Deprecated. Accesses the textual contents of the node. |
| void | setChildren(NodeList children) | Deprecated. Set the children of this node. |
| void | setEndPosition(int position) | Deprecated. Sets the ending position of the node. |
| void | setPage(Page page) | Deprecated. Set the page this node came from. |
| void | setParent(Node node) | Deprecated. Sets the parent of this node. |
| void | setStartPosition(int position) | Deprecated. Sets the starting position of the node. |
| void | setText(java.lang.String text) | Deprecated. Sets the contents of the node. |
| java.lang.String | toHtml() | Deprecated. Return the HTML for this node. |
| java.lang.String | toPlainTextString() | Deprecated. A string representation of the node. |
| java.lang.String | toString() | Deprecated. Return the string representation of the node. |
| Methods inherited from class java.lang.Object |
| finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Field Detail |
protected Text delegate
| Constructor Detail |
protected AbstractNodeDecorator(Text delegate)
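
The wrapping and delegation that this constructor sets up can be illustrated with a small, self-contained sketch. It does not use the htmlparser types: TextLike, PlainText, TextDecorator and UpperCaseText are hypothetical names, and the interface is reduced to two methods. The assumption is simply that a base decorator stores the delegate and forwards each call, while a concrete decorator overrides only what it changes.

```java
import java.util.Locale;

// Hypothetical stand-in for the Text interface, reduced to two methods.
interface TextLike
{
    String getText ();
    void setText (String text);
}

// Plain implementation that will be wrapped.
class PlainText implements TextLike
{
    private String text;

    PlainText (String text) { this.text = text; }

    public String getText () { return text; }
    public void setText (String text) { this.text = text; }
}

// Base decorator: holds a delegate and forwards every call to it.
abstract class TextDecorator implements TextLike
{
    protected final TextLike delegate;

    protected TextDecorator (TextLike delegate)
    {
        this.delegate = delegate;
    }

    public String getText () { return delegate.getText (); }
    public void setText (String text) { delegate.setText (text); }
}

// Concrete decorator: overrides only the behaviour it changes.
class UpperCaseText extends TextDecorator
{
    UpperCaseText (TextLike delegate) { super (delegate); }

    @Override
    public String getText ()
    {
        return delegate.getText ().toUpperCase (Locale.ROOT);
    }
}

class DecoratorDemo
{
    public static void main (String[] args)
    {
        TextLike text = new UpperCaseText (new PlainText ("hello"));
        text.setText ("world");             // forwarded unchanged to the delegate
        System.out.println (text.getText ()); // read back through the decorator
    }
}
```

Every interface method needs a forwarding implementation in the base class, which is the overhead that direct subclasses or a dynamic proxy avoid.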
| Method Detail |
public java.lang.Object clone()
    throws java.lang.CloneNotSupportedException
Specified by:
    clone in interface Node
Throws:
    java.lang.CloneNotSupportedException - This shouldn't be thrown since the Node interface extends Cloneable.

public void accept(NodeVisitor visitor)
Description copied from interface: Node
Apply the visitor to this node.
Specified by:
    accept in interface Node
Parameters:
    visitor - The visitor to this node.

public void collectInto(NodeList list,
    NodeFilter filter)
Description copied from interface: Node
Collect this node and its child nodes into a list, provided the node satisfies the filtering criteria.
This mechanism makes filtering code easy to write, because embedded tags
do not have to be collected separately. For example, the links on a page
cannot all be found at the top level, since many tags (such as form tags)
can contain embedded links. The links could be extracted by checking
whether the current node is a CompositeTag and walking its children;
this method does that for you.
With collectInto(), the code to extract all links from a page becomes:
NodeList list = new NodeList ();
NodeFilter filter = new TagNameFilter ("A");
for (NodeIterator e = parser.elements (); e.hasMoreNodes ();)
    e.nextNode ().collectInto (list, filter);
Thus, list will hold all the link nodes, irrespective of how
deep the links are embedded.
Another way to accomplish the same objective is:
NodeList list = new NodeList ();
NodeFilter filter = new TagClassFilter (LinkTag.class);
for (NodeIterator e = parser.elements (); e.hasMoreNodes ();)
    e.nextNode ().collectInto (list, filter);
This is slightly less specific because the LinkTag class may be
registered for more than one node name, e.g. <LINK> tags too.
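The recursive collection described above can be sketched on a miniature tree. This is an illustration only, assuming a node that adds itself when the filter accepts it and then passes the same list and filter to each child; MiniNode and CollectDemo are hypothetical names, and java.util.function.Predicate stands in for NodeFilter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical miniature node; not the htmlparser Node interface.
class MiniNode
{
    final String name;
    final List<MiniNode> children = new ArrayList<> ();

    MiniNode (String name, MiniNode... kids)
    {
        this.name = name;
        for (MiniNode kid : kids)
            children.add (kid);
    }

    // Add this node if it passes the filter, then recurse into the children.
    void collectInto (List<MiniNode> list, Predicate<MiniNode> filter)
    {
        if (filter.test (this))
            list.add (this);
        for (MiniNode child : children)
            child.collectInto (list, filter);
    }
}

class CollectDemo
{
    // Collect every node named "A" under the given top-level nodes.
    static List<MiniNode> links (MiniNode... roots)
    {
        List<MiniNode> list = new ArrayList<> ();
        for (MiniNode root : roots)
            root.collectInto (list, n -> n.name.equals ("A"));
        return list;
    }

    public static void main (String[] args)
    {
        // one link at the top level, one nested inside FORM > P
        MiniNode form = new MiniNode ("FORM", new MiniNode ("P", new MiniNode ("A")));
        System.out.println (links (new MiniNode ("A"), form).size ());
    }
}
```

The caller only iterates over the top-level nodes; the depth at which a link is nested does not matter.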
Specified by:
    collectInto in interface Node
Parameters:
    list - The list to collect nodes into.
    filter - The criteria to use when deciding if a node should be added to the list.

public int getStartPosition()
Specified by:
    getStartPosition in interface Node
See Also:
    Node.setStartPosition(int)

public void setStartPosition(int position)
Specified by:
    setStartPosition in interface Node
Parameters:
    position - The new start position.
See Also:
    Node.getStartPosition()

public int getEndPosition()
Specified by:
    getEndPosition in interface Node
See Also:
    Node.setEndPosition(int)

public void setEndPosition(int position)
Specified by:
    setEndPosition in interface Node
Parameters:
    position - The new end position.
See Also:
    Node.getEndPosition()

public Page getPage()
Specified by:
    getPage in interface Node
See Also:
    Node.setPage(org.htmlparser.lexer.Page)

public void setPage(Page page)
Specified by:
    setPage in interface Node
Parameters:
    page - The page that supplied this node.
See Also:
    Node.getPage()

public boolean equals(java.lang.Object arg0)

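public Node getParent()
Description copied from interface: Node
Get the parent of this node. [...] Lexer.
Currently, the object returned from this method can be safely cast to a
CompositeTag, but this behaviour should not
be expected in the future.
Specified by:
    getParent in interface Node
Returns:
    [...] null otherwise.
See Also:
    Node.setParent(org.htmlparser.Node)

public java.lang.String getText()
Description copied from interface: Text
Accesses the textual contents of the node.
Specified by:
    getText in interface Text
See Also:
    Text.setText(java.lang.String)

public void setParent(Node node)
Description copied from interface: Node
Sets the parent of this node.
Specified by:
    setParent in interface Node
Parameters:
    node - The node that contains this node.
See Also:
    Node.getParent()

public NodeList getChildren()
Specified by:
    getChildren in interface Node
Returns:
    [...] null otherwise.
See Also:
    Node.setChildren(org.htmlparser.util.NodeList)

public void setChildren(NodeList children)
Specified by:
    setChildren in interface Node
Parameters:
    children - The new list of children this node contains.
See Also:
    Node.getChildren()

public Node getFirstChild()
Description copied from interface: Node
Get the first child of this node.
Specified by:
    getFirstChild in interface Node
Returns:
    [...] null otherwise.

public Node getLastChild()
Description copied from interface: Node
Get the last child of this node.
Specified by:
    getLastChild in interface Node
Returns:
    [...] null otherwise.

public Node getPreviousSibling()
Description copied from interface: Node
Get the previous sibling to this node.
Specified by:
    getPreviousSibling in interface Node
Returns:
    [...] null otherwise.

public Node getNextSibling()
Description copied from interface: Node
Get the next sibling to this node.
Specified by:
    getNextSibling in interface Node
Returns:
    [...] null otherwise.

public void setText(java.lang.String text)
Description copied from interface: Text
Sets the contents of the node.
Specified by:
    setText in interface Text
Parameters:
    text - The new text for the node.
See Also:
    Text.getText()

public java.lang.String toHtml()
Description copied from interface: Node
Return the HTML for this node.
Specified by:
    toHtml in interface Node

public java.lang.String toPlainTextString()
Description copied from interface: Node
A string representation of the node.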
for (NodeIterator e = parser.elements (); e.hasMoreNodes ();)
    // or do whatever processing you wish with the plain text string
    System.out.println (e.nextNode ().toPlainTextString ());
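Specified by:
    toPlainTextString in interface Node

public java.lang.String toString()
Description copied from interface: Node
Return the string representation of the node. [...] System.out.println (node); or within a debugging environment.
Specified by:
    toString in interface Node
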
public void doSemanticAction()
    throws ParserException
Description copied from interface: Node
Perform the meaning of this tag. [...] Node.getChildren().
Specified by:
    doSemanticAction in interface Node
Throws:
    ParserException - If a problem is encountered performing the semantic action.