StringExtractor

org.htmlparser.parserapplications
Class StringExtractor

java.lang.Object
  org.htmlparser.parserapplications.StringExtractor

public class StringExtractor
extends java.lang.Object

Extracts plain-text strings from a web page. This illustrative program gathers the textual contents of a page, using a StringBean to accumulate the user-visible text (what a browser would display) into a single string.

Constructor Summary
`StringExtractor(java.lang.String resource)` Construct a StringExtractor to read from the given resource.

Method Summary
`java.lang.String`	`extractStrings(boolean links)` Extract the text from a page.
`static void`	`main(java.lang.String[] args)` Mainline.

Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail

StringExtractor

public StringExtractor(java.lang.String resource)

Construct a StringExtractor to read from the given resource.
Parameters: resource - Either a URL or a file name.

Method Detail

extractStrings

public java.lang.String extractStrings(boolean links)
                                throws ParserException

Extract the text from a page.

Parameters: links - If true, include hyperlinks in the output.
Returns: The textual contents of the page.
Throws: ParserException - If a parse error occurs.
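
A minimal usage sketch of the constructor and extractStrings, assuming the HTML Parser library is on the classpath and that ParserException lives in org.htmlparser.util. The class name StringExtractorExample and the helper textOf are hypothetical and exist only for illustration. The helper writes the markup to a temporary file, relying on the documented behaviour that the resource may be a file name as well as a URL.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import org.htmlparser.parserapplications.StringExtractor;
import org.htmlparser.util.ParserException;

public class StringExtractorExample {

    // Hypothetical helper (not part of the library): stores the markup in a
    // temporary file, then extracts its user-visible text.
    static String textOf(String html, boolean links)
            throws IOException, ParserException {
        Path page = Files.createTempFile("page", ".html");
        try {
            Files.write(page, html.getBytes(StandardCharsets.UTF_8));
            // The resource argument may be a URL or a file name.
            StringExtractor extractor = new StringExtractor(page.toString());
            // links == true asks for hyperlinks to be included in the output.
            return extractor.extractStrings(links);
        } finally {
            Files.deleteIfExists(page);
        }
    }

    public static void main(String[] args) throws IOException, ParserException {
        String html = "<html><body><p>Hello "
                + "<a href=\"http://example.com/\">world</a></p></body></html>";
        System.out.println(textOf(html, false)); // text only
        System.out.println(textOf(html, true));  // text with hyperlinks
    }
}
```

Passing a URL string to the constructor instead of a file name should work the same way, though it needs network access and may fail with a ParserException if the page cannot be fetched or parsed.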

main

public static void main(java.lang.String[] args)

Mainline. The command-line entry point of the program.

Parameters: args - The command line arguments.
