websphinx
Class Tag

java.lang.Object
  |
  +--websphinx.Region
        |
        +--websphinx.Tag

public class Tag
extends Region

Tag in an HTML page.


Field Summary
static java.lang.String A
          Commonly useful tag names.
static java.lang.String ABBREV
           
static java.lang.String ACRONYM
           
static java.lang.String ADDRESS
           
static java.lang.String APPLET
           
static java.lang.String AREA
           
static java.lang.String B
           
static java.lang.String BASE
           
static java.lang.String BASEFONT
           
static java.lang.String BDO
           
static java.lang.String BGSOUND
           
static java.lang.String BIG
           
static java.lang.String BLINK
           
static java.lang.String BLOCKQUOTE
           
static java.lang.String BODY
           
static java.lang.String BR
           
static java.lang.String CAPTION
           
static java.lang.String CENTER
           
static java.lang.String CITE
           
static java.lang.String CODE
           
static java.lang.String COL
           
static java.lang.String COLGROUP
           
static java.lang.String COMMENT
           
static java.lang.String DD
           
static java.lang.String DEL
           
static java.lang.String DFN
           
static java.lang.String DIR
           
static java.lang.String DIV
           
static java.lang.String DL
           
static java.lang.String DT
           
static java.lang.String EM
           
static java.lang.String EMBED
           
static java.lang.String FONT
           
static java.lang.String FORM
           
static java.lang.String FRAME
           
static java.lang.String FRAMESET
           
static java.lang.String H1
           
static java.lang.String H2
           
static java.lang.String H3
           
static java.lang.String H4
           
static java.lang.String H5
           
static java.lang.String H6
           
static java.lang.String HEAD
           
static java.lang.String HR
           
static java.lang.String HTML
           
static java.lang.String I
           
static java.lang.String IMG
           
static java.lang.String INPUT
           
static java.lang.String ISINDEX
           
static java.lang.String KBD
           
static java.lang.String LI
           
static java.lang.String LINK
           
static java.lang.String LISTING
           
static java.lang.String MAP
           
static java.lang.String MARQUEE
           
static int MAX_LENGTH
          Length of longest tag name.
static java.lang.String MENU
           
static java.lang.String META
           
static java.lang.String NEXTID
           
static java.lang.String NOBR
           
static java.lang.String NOEMBED
           
static java.lang.String NOFRAMES
           
static java.lang.String OBJECT
           
static java.lang.String OL
           
static java.lang.String OPTION
           
static java.lang.String P
           
static java.lang.String PARAM
           
static java.lang.String PLAINTEXT
           
static java.lang.String PRE
           
static java.lang.String SAMP
           
static java.lang.String SCRIPT
           
static java.lang.String SELECT
           
static java.lang.String SMALL
           
static java.lang.String SPACER
           
static java.lang.String STRIKE
           
static java.lang.String STRONG
           
static java.lang.String STYLE
           
static java.lang.String SUB
           
static java.lang.String SUP
           
static java.lang.String TABLE
           
static java.lang.String TD
           
static java.lang.String TEXTAREA
           
static java.lang.String TH
           
static java.lang.String TITLE
           
static java.lang.String TR
           
static java.lang.String TT
           
static java.lang.String U
           
static java.lang.String UL
           
static java.lang.String VAR
           
static java.lang.String WBR
           
static java.lang.String XMP
           
 
Fields inherited from class websphinx.Region
end, names, source, start, TRUE
 
Constructor Summary
Tag(Page page, int start, int end, java.lang.String tagName, boolean startTag)
          Make a Tag.
 
Method Summary
 int countHTMLAttributes()
          Get number of HTML attributes on this tag.
 java.util.Enumeration enumerateHTMLAttributes()
          Enumerate the HTML attributes found on this tag.
 Element getElement()
          Get element to which this tag is the start or end tag.
 java.lang.String getHTMLAttribute(java.lang.String name)
          Get an HTML attribute's value.
 java.lang.String getHTMLAttribute(java.lang.String name, java.lang.String defaultValue)
          Get an HTML attribute's value, with a default value if it doesn't exist.
 java.lang.String[] getHTMLAttributes()
          Get all the HTML attributes found on this tag.
 java.lang.String getTagName()
          Get tag name.
 boolean hasHTMLAttribute(java.lang.String name)
          Test if tag has an HTML attribute.
 boolean isBlockTag()
          Test if tag is a block-level tag.
 boolean isBodyTag()
          Test if tag belongs in the <BODY> element.
 boolean isEndTag()
          Test if tag is an end tag.
 boolean isFlowTag()
          Test if tag is a flow-level tag.
 boolean isHeadTag()
          Test if tag belongs in the <HEAD> element.
 boolean isStartTag()
          Test if tag is a start tag.
 Tag removeHTMLAttribute(java.lang.String name)
          Copy this tag, removing an HTML attribute.
 Tag replaceHTMLAttribute(java.lang.String name)
          Copy this tag, setting an HTML attribute's value to TRUE.
 Tag replaceHTMLAttribute(java.lang.String name, java.lang.String value)
          Copy this tag, setting an HTML attribute's value.
static java.lang.String toHTMLAttributeName(java.lang.String name)
          Convert a String to an HTML attribute name.
static java.lang.String toTagName(java.lang.String name)
          Convert a String to a tag name.
 
Methods inherited from class websphinx.Region
enumerateObjectLabels, findEnd, findStart, getEnd, getField, getFields, getLabel, getLabel, getLength, getNumericLabel, getObjectLabel, getObjectLabels, getRootElement, getSource, getStart, hasAllLabels, hasAllLabels, hasAnyLabels, hasAnyLabels, hasLabel, removeLabel, setField, setFields, setLabel, setLabel, setObjectLabel, span, toHTML, toString, toTags, toText
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

A

public static final java.lang.String A
Commonly useful tag names. Derived from HTML Elements at Sandia National Labs.


ABBREV

public static final java.lang.String ABBREV

ACRONYM

public static final java.lang.String ACRONYM

ADDRESS

public static final java.lang.String ADDRESS

APPLET

public static final java.lang.String APPLET

AREA

public static final java.lang.String AREA

B

public static final java.lang.String B

BASE

public static final java.lang.String BASE

BASEFONT

public static final java.lang.String BASEFONT

BDO

public static final java.lang.String BDO

BGSOUND

public static final java.lang.String BGSOUND

BIG

public static final java.lang.String BIG

BLINK

public static final java.lang.String BLINK

BLOCKQUOTE

public static final java.lang.String BLOCKQUOTE

BODY

public static final java.lang.String BODY

BR

public static final java.lang.String BR

CAPTION

public static final java.lang.String CAPTION

CENTER

public static final java.lang.String CENTER

CITE

public static final java.lang.String CITE

CODE

public static final java.lang.String CODE

COL

public static final java.lang.String COL

COLGROUP

public static final java.lang.String COLGROUP

COMMENT

public static final java.lang.String COMMENT

DD

public static final java.lang.String DD

DEL

public static final java.lang.String DEL

DFN

public static final java.lang.String DFN

DIR

public static final java.lang.String DIR

DIV

public static final java.lang.String DIV

DL

public static final java.lang.String DL

DT

public static final java.lang.String DT

EM

public static final java.lang.String EM

EMBED

public static final java.lang.String EMBED

FONT

public static final java.lang.String FONT

FRAME

public static final java.lang.String FRAME

FRAMESET

public static final java.lang.String FRAMESET

FORM

public static final java.lang.String FORM

H1

public static final java.lang.String H1

H2

public static final java.lang.String H2

H3

public static final java.lang.String H3

H4

public static final java.lang.String H4

H5

public static final java.lang.String H5

H6

public static final java.lang.String H6

HEAD

public static final java.lang.String HEAD

HR

public static final java.lang.String HR

HTML

public static final java.lang.String HTML

I

public static final java.lang.String I

IMG

public static final java.lang.String IMG

INPUT

public static final java.lang.String INPUT

ISINDEX

public static final java.lang.String ISINDEX

KBD

public static final java.lang.String KBD

LI

public static final java.lang.String LI

LINK

public static final java.lang.String LINK

LISTING

public static final java.lang.String LISTING

MAP

public static final java.lang.String MAP

MARQUEE

public static final java.lang.String MARQUEE

MENU

public static final java.lang.String MENU

META

public static final java.lang.String META

NEXTID

public static final java.lang.String NEXTID

NOBR

public static final java.lang.String NOBR

NOEMBED

public static final java.lang.String NOEMBED

NOFRAMES

public static final java.lang.String NOFRAMES

OBJECT

public static final java.lang.String OBJECT

OL

public static final java.lang.String OL

OPTION

public static final java.lang.String OPTION

P

public static final java.lang.String P

PARAM

public static final java.lang.String PARAM

PLAINTEXT

public static final java.lang.String PLAINTEXT

PRE

public static final java.lang.String PRE

SAMP

public static final java.lang.String SAMP

SCRIPT

public static final java.lang.String SCRIPT

SELECT

public static final java.lang.String SELECT

SMALL

public static final java.lang.String SMALL

SPACER

public static final java.lang.String SPACER

STRIKE

public static final java.lang.String STRIKE

STRONG

public static final java.lang.String STRONG

STYLE

public static final java.lang.String STYLE

SUB

public static final java.lang.String SUB

SUP

public static final java.lang.String SUP

TABLE

public static final java.lang.String TABLE

TD

public static final java.lang.String TD

TEXTAREA

public static final java.lang.String TEXTAREA

TH

public static final java.lang.String TH

TITLE

public static final java.lang.String TITLE

TR

public static final java.lang.String TR

TT

public static final java.lang.String TT

U

public static final java.lang.String U

UL

public static final java.lang.String UL

VAR

public static final java.lang.String VAR

WBR

public static final java.lang.String WBR

XMP

public static final java.lang.String XMP

MAX_LENGTH

public static int MAX_LENGTH
Length of longest tag name.

Constructor Detail

Tag

public Tag(Page page,
           int start,
           int end,
           java.lang.String tagName,
           boolean startTag)
Make a Tag.

Parameters:
page - Page containing tag
start - Starting offset of tag in page
end - Ending offset of tag
tagName - Name of tag (like "p")
startTag - true for start tags (like "<p>"), false for end tags ("</p>")
Method Detail

getTagName

public java.lang.String getTagName()
Get tag name.

Returns:
tag name (like "p"), in lower-case, String.intern()'ed form.

getElement

public Element getElement()
Get element to which this tag is the start or end tag.

Returns:
element, or null if tag has no element.

toTagName

public static java.lang.String toTagName(java.lang.String name)
Convert a String to a tag name. Tag names are lower-case, intern()'ed Strings. Thus you can compare tag names with ==, as in: getTagName() == Tag.IMG.

Parameters:
name - Name to convert (e.g., "P")
Returns:
tag name (e.g. "p"), in lower-case, String.intern()'ed form.
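
The identity comparison described above can be illustrated with a minimal, self-contained sketch. It does not use the websphinx classes: TagNameSketch, its IMG constant, and its toTagName method are hypothetical stand-ins that assume the conversion is simply lower-casing followed by String.intern().

```java
import java.util.Locale;

public class TagNameSketch {
    // Hypothetical stand-in for a constant like Tag.IMG.
    // String literals are interned by the JVM.
    public static final String IMG = "img";

    // Assumed conversion: lower-case the name, then intern it.
    public static String toTagName(String name) {
        return name.toLowerCase(Locale.ROOT).intern();
    }

    public static void main(String[] args) {
        // After conversion, == compares the same pooled String object.
        String converted = toTagName("IMG");
        System.out.println(converted == IMG);   // true

        // A String that was never interned is equal but not identical.
        String raw = new String("img");
        System.out.println(raw == IMG);         // false
        System.out.println(raw.equals(IMG));    // true
    }
}
```

Comparing with == is only safe when both sides have been converted; a name taken from elsewhere should be passed through the conversion first or compared with equals().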

isStartTag

public boolean isStartTag()
Test if tag is a start tag. Equivalent to !isEndTag().

Returns:
true if and only if tag is a start tag (like "<P>")

isEndTag

public boolean isEndTag()
Test if tag is an end tag. Equivalent to !isStartTag().

Returns:
true if and only if tag is an end tag (like "</P>")

isBlockTag

public boolean isBlockTag()
Test if tag is a block-level tag. Equivalent to !isFlowTag().

Returns:
true if and only if tag is a block-level tag (like "<P>")

isFlowTag

public boolean isFlowTag()
Test if tag is a flow-level tag. Equivalent to !isBlockTag().

Returns:
true if and only if tag is a flow-level tag (like "<A>")

isHeadTag

public boolean isHeadTag()
Test if tag belongs in the <HEAD> element.

Returns:
true if and only if tag is a HEAD-level tag (like "<TITLE>")

isBodyTag

public boolean isBodyTag()
Test if tag belongs in the <BODY> element.

Returns:
true if and only if tag is a BODY-level tag (like "<A>")

toHTMLAttributeName

public static java.lang.String toHTMLAttributeName(java.lang.String name)
Convert a String to an HTML attribute name. Attribute names are lower-case, intern()'ed Strings. Thus you can compare attribute names with ==.

Parameters:
name - Name to convert (e.g., "HREF")
Returns:
attribute name (e.g. "href"), in lower-case, String.intern()'ed form.

hasHTMLAttribute

public boolean hasHTMLAttribute(java.lang.String name)
Test if tag has an HTML attribute.

Parameters:
name - Name of HTML attribute (e.g. "HREF"). Doesn't have to be converted with toHTMLAttributeName().
Returns:
true if tag has the attribute, false if not

getHTMLAttribute

public java.lang.String getHTMLAttribute(java.lang.String name)
Get an HTML attribute's value.

Parameters:
name - Name of HTML attribute (e.g. "HREF"). Doesn't have to be converted with toHTMLAttributeName().
Returns:
value of attribute if it exists, TRUE if the attribute exists but has no value, or null if tag lacks the attribute.

getHTMLAttribute

public java.lang.String getHTMLAttribute(java.lang.String name,
                                         java.lang.String defaultValue)
Get an HTML attribute's value, with a default value if it doesn't exist.

Parameters:
name - Name of HTML attribute (e.g. "HREF"). Doesn't have to be converted with toHTMLAttributeName().
defaultValue - default value to return if the attribute doesn't exist
Returns:
value of attribute if it exists, TRUE if the attribute exists but has no value, or defaultValue if tag lacks the attribute.
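
The three possible results (a value, TRUE, or the default) can be sketched without the websphinx classes. AttributeLookupSketch below is a hypothetical stand-in: its TRUE constant only plays the role of the inherited Region.TRUE field, whose actual value is not given on this page, and the map-based storage is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class AttributeLookupSketch {
    // Hypothetical sentinel standing in for Region.TRUE (real value unknown here).
    public static final String TRUE = "true";

    private final Map<String, String> attrs = new HashMap<>();

    // Store an attribute; a null value means "present but valueless".
    public void put(String name, String value) {
        attrs.put(name.toLowerCase(Locale.ROOT), value == null ? TRUE : value);
    }

    // Look up an attribute by name in any case, falling back to defaultValue.
    public String get(String name, String defaultValue) {
        String value = attrs.get(name.toLowerCase(Locale.ROOT));
        return value != null ? value : defaultValue;
    }

    public static void main(String[] args) {
        AttributeLookupSketch tag = new AttributeLookupSketch();
        tag.put("HREF", "index.html");
        tag.put("nowrap", null);

        System.out.println(tag.get("href", "#"));            // index.html
        System.out.println(tag.get("NOWRAP", "#") == TRUE);  // true
        System.out.println(tag.get("target", "_self"));      // _self
    }
}
```

The name is normalized inside the lookup, which mirrors the note that callers need not convert it themselves.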

countHTMLAttributes

public int countHTMLAttributes()
Get number of HTML attributes on this tag.

Returns:
number of HTML attributes

getHTMLAttributes

public java.lang.String[] getHTMLAttributes()
Get all the HTML attributes found on this tag.

Returns:
array of name-value pairs, alternating between names and values. Thus array[0] is a name, array[1] is a value, array[2] is a name, etc.
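
As an illustration of the alternating layout, the following self-contained sketch walks such an array two entries at a time and collects the pairs into a map. AttributePairsSketch and its toMap helper are hypothetical names, and the sample array is made up rather than produced by a real Tag.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AttributePairsSketch {
    // Convert {name0, value0, name1, value1, ...} into an ordered map.
    public static Map<String, String> toMap(String[] pairs) {
        Map<String, String> map = new LinkedHashMap<>();
        // Step by two: even indices are names, odd indices are values.
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            map.put(pairs[i], pairs[i + 1]);
        }
        return map;
    }

    public static void main(String[] args) {
        // Sample data in the layout described above.
        String[] pairs = {"href", "index.html", "target", "_blank"};
        System.out.println(toMap(pairs)); // {href=index.html, target=_blank}
    }
}
```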

enumerateHTMLAttributes

public java.util.Enumeration enumerateHTMLAttributes()
Enumerate the HTML attributes found on this tag.

Returns:
enumeration of the attribute names found on this tag.

removeHTMLAttribute

public Tag removeHTMLAttribute(java.lang.String name)
Copy this tag, removing an HTML attribute.

Parameters:
name - Name of HTML attribute (e.g. "HREF"). Doesn't have to be converted with toHTMLAttributeName().
Returns:
copy of this tag with named attribute removed. The copy is a region of a fresh page containing only the tag.

replaceHTMLAttribute

public Tag replaceHTMLAttribute(java.lang.String name)
Copy this tag, setting an HTML attribute's value to TRUE.

Parameters:
name - Name of HTML attribute (e.g. "HREF"). Doesn't have to be converted with toHTMLAttributeName().
Returns:
copy of this tag with named attribute set to TRUE. The copy is a region of a fresh page containing only the tag.

replaceHTMLAttribute

public Tag replaceHTMLAttribute(java.lang.String name,
                                java.lang.String value)
Copy this tag, setting an HTML attribute's value.

Parameters:
name - Name of HTML attribute (e.g. "HREF"). Doesn't have to be converted with toHTMLAttributeName().
value - New value for the attribute
Returns:
copy of this tag with named attribute set to value. The copy is a region of a fresh page containing only the tag.
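
The remove and replace methods return a modified copy and leave the original tag untouched. A minimal sketch of that copy-on-edit pattern, independent of the websphinx classes, might look like the following. CopyOnEditSketch and its methods are hypothetical, and the sketch models only the attribute set, not the fresh page that the real copy belongs to.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public final class CopyOnEditSketch {
    private final Map<String, String> attrs;

    public CopyOnEditSketch(Map<String, String> attrs) {
        // Defensive copy so later changes to the argument do not leak in.
        this.attrs = new LinkedHashMap<>(attrs);
    }

    // Return a copy with the attribute set; this object is unchanged.
    public CopyOnEditSketch replace(String name, String value) {
        Map<String, String> copy = new LinkedHashMap<>(attrs);
        copy.put(name.toLowerCase(Locale.ROOT), value);
        return new CopyOnEditSketch(copy);
    }

    // Return a copy without the attribute; this object is unchanged.
    public CopyOnEditSketch remove(String name) {
        Map<String, String> copy = new LinkedHashMap<>(attrs);
        copy.remove(name.toLowerCase(Locale.ROOT));
        return new CopyOnEditSketch(copy);
    }

    public String get(String name) {
        return attrs.get(name.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        CopyOnEditSketch original = new CopyOnEditSketch(new LinkedHashMap<>());
        CopyOnEditSketch edited = original.replace("HREF", "index.html");
        System.out.println(original.get("href")); // null
        System.out.println(edited.get("href"));   // index.html
    }
}
```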