websphinx
Class Region

java.lang.Object
  |
  +--websphinx.Region
Direct Known Subclasses:
Element, Page, SearchEngineResult, Tag, Text

public class Region
extends java.lang.Object

Region of an HTML page.


Field Summary
protected  int end
           
protected  java.util.Hashtable names
           
protected  Page source
           
protected  int start
           
static java.lang.String TRUE
          Default value for labels set with setLabel(name).
 
Constructor Summary
Region(Page page, int start, int end)
          Makes a Region.
Region(Region region)
          Makes a Region by copying another region's parameters.
 
Method Summary
 java.util.Enumeration enumerateObjectLabels()
          Enumerate the labels of the region.
static int findEnd(Region[] regions, int p)
          Finds a region that ends at or after a given position.
static int findStart(Region[] regions, int p)
          Finds a region that starts at or after a given position.
 int getEnd()
          Gets offset after end of region.
 Region getField(java.lang.String name)
          Get a named subregion.
 Region[] getFields(java.lang.String name)
          Get a set of named subregions.
 java.lang.String getLabel(java.lang.String name)
          Get a label's value.
 java.lang.String getLabel(java.lang.String name, java.lang.String defaultValue)
          Get a label's value.
 int getLength()
          Gets length of the region.
 java.lang.Number getNumericLabel(java.lang.String name, java.lang.Number defaultValue)
          Get a label's value as a number.
 java.lang.Object getObjectLabel(java.lang.String name)
          Get an object-valued label.
 java.lang.String getObjectLabels()
          Get a String containing the labels of the region.
 Element getRootElement()
          Get the root HTML element of the region.
 Page getSource()
          Gets page containing the region.
 int getStart()
          Gets starting offset of region in page content.
 boolean hasAllLabels(java.lang.String expr)
          Test if all of several labels are set.
 boolean hasAllLabels(java.lang.String[] labels)
          Test if all of several labels are set.
 boolean hasAnyLabels(java.lang.String expr)
          Test if one or more of several labels are set.
 boolean hasAnyLabels(java.lang.String[] labels)
          Test if one or more of several labels are set.
 boolean hasLabel(java.lang.String name)
          Test if a label is set.
 void removeLabel(java.lang.String name)
          Remove a label.
 void setField(java.lang.String name, Region region)
          Name a subregion (by setting a label to point to it).
 void setFields(java.lang.String name, Region[] regions)
          Name a set of subregions (by pointing a label to them).
 void setLabel(java.lang.String name)
          Set a label on the region.
 void setLabel(java.lang.String name, java.lang.String value)
          Set a string-valued label.
 void setObjectLabel(java.lang.String name, java.lang.Object value)
          Set an object-valued label.
 Region span(Region r)
          Makes a new Region containing two regions.
 java.lang.String toHTML()
          Converts the region to HTML, e.g. "<tag><tag><tag>text text</tag>".
 java.lang.String toString()
          Gets region as raw content.
 java.lang.String toTags()
          Converts the region to HTML tags with no text, e.g. "<tag><tag></tag>".
 java.lang.String toText()
          Converts the region to tagless text, e.g. "text text".
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

source

protected Page source

start

protected int start

end

protected int end

names

protected java.util.Hashtable names

TRUE

public static final java.lang.String TRUE
Default value for labels set with setLabel(name). The value of TRUE is "true".

Constructor Detail

Region

public Region(Page page,
              int start,
              int end)
Makes a Region.

Parameters:
page - Page containing region
start - Starting offset of region in page content
end - Ending offset of region in page content (the offset just after the end of the region, as returned by getEnd())

Region

public Region(Region region)
Makes a Region by copying another region's parameters.

Parameters:
region - Region to copy

Method Detail

getSource

public Page getSource()
Gets page containing the region.

Returns:
page containing the region

getStart

public int getStart()
Gets starting offset of region in page content.

Returns:
zero-based offset where region begins in page content

getEnd

public int getEnd()
Gets offset after end of region.

Returns:
zero-based offset just after the end of the region.

getLength

public int getLength()
Gets length of the region. Equivalent to getEnd() - getStart().

Returns:
length of the region in characters, since start and end are offsets into the page content.
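The offset convention described by getStart(), getEnd(), and getLength() is a half-open interval: start is included, end is not. The following is a minimal sketch of that convention using a plain String as a stand-in for the page content. OffsetSketch, slice, and length are hypothetical names for illustration and are not part of websphinx.

```java
// Hypothetical illustration of half-open [start, end) offsets; not websphinx code.
final class OffsetSketch {
    // Returns the characters from start (inclusive) up to end (exclusive).
    static String slice(String content, int start, int end) {
        return content.substring(start, end);
    }

    // Mirrors the documented identity getLength() == getEnd() - getStart().
    static int length(int start, int end) {
        return end - start;
    }

    public static void main(String[] args) {
        String content = "<b>hi</b> there";
        System.out.println(slice(content, 0, 9)); // prints <b>hi</b>
        System.out.println(length(0, 9));         // prints 9
    }
}
```

Because end points just past the last character, an empty region has start equal to end, and adjacent regions can share a boundary offset without overlapping.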

toHTML

public java.lang.String toHTML()
Converts the region to HTML, e.g. "<tag><tag><tag>text text</tag>". If the region does not contain HTML, then this function quotes all the <, >, and & characters found in the page content, and wraps the result in <PRE> and </PRE>.

Returns:
a string consisting of the HTML content contained by this region.
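The quoting step described for non-HTML content can be sketched as follows. This is not the library's implementation: QuoteSketch and quote are hypothetical names, and the use of a <PRE> element as the wrapper is an assumption about the tags that the description refers to.

```java
// Hypothetical sketch of quoting raw text for display as HTML; not websphinx code.
final class QuoteSketch {
    // Replaces &, <, and > with character entities and wraps the result.
    // ASSUMPTION: the wrapper element is <PRE>.
    static String quote(String raw) {
        StringBuilder sb = new StringBuilder("<PRE>");
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.append("</PRE>").toString();
    }

    public static void main(String[] args) {
        System.out.println(quote("a < b && c")); // prints <PRE>a &lt; b &amp;&amp; c</PRE>
    }
}
```

Handling each character in a single pass avoids the double-escaping that can occur when replacements are chained and & is not replaced first.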

toText

public java.lang.String toText()
Converts the region to tagless text, e.g. "text text".

Returns:
a string consisting of the text in the page contained by this region

toTags

public java.lang.String toTags()
Converts the region to HTML tags with no text, e.g. "<tag><tag></tag>".

Returns:
a string consisting of the tags in the page contained by this region

toString

public java.lang.String toString()
Gets region as raw content.

Overrides:
toString in class java.lang.Object
Returns:
string representation of the region

getRootElement

public Element getRootElement()
Get the root HTML element of the region.

Returns:
first HTML element whose start tag is completely in the region.

findStart

public static int findStart(Region[] regions,
                            int p)
Finds a region that starts at or after a given position.

Parameters:
regions - array of regions sorted by starting offset
p - Desired starting offset
Returns:
index k into regions such that:
  1. forall j<k: regions[j].start < p
  2. regions[k].start >= p
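The two conditions above describe a lower-bound search over a sorted array. The sketch below shows one way to satisfy them with a binary search over plain int starting offsets. FindStartSketch is a hypothetical name, and returning the array length when no region starts at or after p is an assumption, since the description does not state what happens in that case.

```java
// Hypothetical lower-bound search over sorted starting offsets; not websphinx code.
final class FindStartSketch {
    // Returns the smallest k with starts[k] >= p.
    // ASSUMPTION: returns starts.length when every start is less than p.
    static int findStart(int[] starts, int p) {
        int lo = 0;
        int hi = starts.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;   // midpoint without int overflow
            if (starts[mid] < p) {
                lo = mid + 1;            // everything up to mid starts before p
            } else {
                hi = mid;                // mid is a candidate answer
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        int[] starts = {0, 10, 10, 25};
        System.out.println(findStart(starts, 10)); // prints 1
        System.out.println(findStart(starts, 11)); // prints 3
        System.out.println(findStart(starts, 30)); // prints 4
    }
}
```

The same loop applied to ending offsets would satisfy the analogous conditions given for findEnd.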

findEnd

public static int findEnd(Region[] regions,
                          int p)
Finds a region that ends at or after a given position.

Parameters:
regions - array of regions sorted by ending offset
p - Desired ending offset
Returns:
index k into regions such that:
  1. forall j<k: regions[j].end < p
  2. regions[k].end >= p

span

public Region span(Region r)
Makes a new Region containing two regions.

Parameters:
r - end of spanning region
Returns:
region from the beginning of this region to the end of r. Both regions must have the same source, and r must end after this region starts.
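As an illustration of the span contract, the sketch below models a region as a source name plus a pair of offsets. SpanSketch and its nested Span class are hypothetical stand-ins. Throwing IllegalArgumentException when a precondition is violated is an assumption; the description does not say how the real method reacts.

```java
// Hypothetical model of span(); not websphinx code.
final class SpanSketch {
    static final class Span {
        final String source;  // stand-in for the containing page
        final int start;
        final int end;

        Span(String source, int start, int end) {
            this.source = source;
            this.start = start;
            this.end = end;
        }

        // Returns a span from this.start to r.end.
        // ASSUMPTION: precondition failures are reported by exception.
        Span span(Span r) {
            if (!source.equals(r.source)) {
                throw new IllegalArgumentException("regions have different sources");
            }
            if (r.end <= start) {
                throw new IllegalArgumentException("r must end after this region starts");
            }
            return new Span(source, start, r.end);
        }
    }

    public static void main(String[] args) {
        Span a = new Span("page", 5, 10);
        Span b = new Span("page", 20, 30);
        Span s = a.span(b);
        System.out.println(s.start + ".." + s.end); // prints 5..30
    }
}
```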

setObjectLabel

public void setObjectLabel(java.lang.String name,
                           java.lang.Object value)
Set an object-valued label.

Parameters:
name - name of label (case-sensitive, whitespace permitted)
value - value set for label. If null, the label is removed.

getObjectLabel

public java.lang.Object getObjectLabel(java.lang.String name)
Get an object-valued label.

Parameters:
name - name of label (case-sensitive, whitespace permitted)
Returns:
Object value set for label, or null if label not set

enumerateObjectLabels

public java.util.Enumeration enumerateObjectLabels()
Enumerate the labels of the region.

Returns:
enumeration producing label names

getObjectLabels

public java.lang.String getObjectLabels()
Get a String containing the labels of the region.

Returns:
string containing the label names, separated by spaces

setLabel

public void setLabel(java.lang.String name,
                     java.lang.String value)
Set a string-valued label.

Parameters:
name - name of label (case-sensitive, whitespace permitted)
value - value set for label. If null, the label is removed.

setLabel

public void setLabel(java.lang.String name)
Set a label on the region. The value of the label defaults to TRUE.

Parameters:
name - name of label (case-sensitive, whitespace permitted)

getLabel

public java.lang.String getLabel(java.lang.String name)
Get a label's value.

Parameters:
name - name of label (case-sensitive, whitespace permitted)
Returns:
value of label, or null if label not set

getLabel

public java.lang.String getLabel(java.lang.String name,
                                 java.lang.String defaultValue)
Get a label's value. If the label is not set, return defaultValue.

Parameters:
name - name of label (case-sensitive, whitespace permitted)
defaultValue - default value that should be returned if label is not set
Returns:
value of label, or defaultValue if not set
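The label methods described so far behave like a small map from names to values. The following sketch reproduces the documented behavior (a null value removes the label, setLabel(name) stores TRUE, and getLabel falls back to a default) on top of java.util.Hashtable. LabelSketch is a hypothetical class that stores only strings, and the real class may differ in detail.

```java
import java.util.Hashtable;

// Hypothetical string-label store; not websphinx code.
final class LabelSketch {
    static final String TRUE = "true";

    private final Hashtable<String, String> names = new Hashtable<>();

    // A null value removes the label, as the parameter description states.
    void setLabel(String name, String value) {
        if (value == null) {
            names.remove(name);
        } else {
            names.put(name, value);
        }
    }

    // With no explicit value, the label is set to TRUE.
    void setLabel(String name) {
        setLabel(name, TRUE);
    }

    String getLabel(String name, String defaultValue) {
        String value = names.get(name);
        return value != null ? value : defaultValue;
    }

    boolean hasLabel(String name) {
        return names.containsKey(name);
    }

    public static void main(String[] args) {
        LabelSketch region = new LabelSketch();
        region.setLabel("visited");
        System.out.println(region.getLabel("visited", "no")); // prints true
        region.setLabel("visited", null);
        System.out.println(region.getLabel("visited", "no")); // prints no
    }
}
```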

getNumericLabel

public java.lang.Number getNumericLabel(java.lang.String name,
                                        java.lang.Number defaultValue)
Get a label's value as a number. Returns the first number (integral or floating point) that can be parsed from the label's value, skipping an arbitrary amount of junk.

Parameters:
name - name of label (case-sensitive, whitespace permitted)
defaultValue - default value that should be returned if label is not set
Returns:
numeric value of label, or defaultValue if not set or no number is found
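Extracting the first number from a label value can be sketched with a regular expression. NumericLabelSketch and firstNumber are hypothetical names. The accepted number syntax (an optional sign, digits, and an optional fraction) and the choice of Long or Double as the result type are assumptions, so the real method may parse differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical first-number extraction; not websphinx code.
final class NumericLabelSketch {
    // ASSUMPTION: a number is an optional sign, then digits with an
    // optional fraction, or a bare fraction such as ".5".
    private static final Pattern NUMBER =
            Pattern.compile("[-+]?(?:\\d+(?:\\.\\d+)?|\\.\\d+)");

    static Number firstNumber(String value, Number defaultValue) {
        if (value == null) {
            return defaultValue;          // label not set
        }
        Matcher m = NUMBER.matcher(value);
        if (!m.find()) {
            return defaultValue;          // no number anywhere in the value
        }
        String text = m.group();
        if (text.indexOf('.') >= 0) {
            return Double.valueOf(text);
        }
        try {
            return Long.valueOf(text);
        } catch (NumberFormatException e) {
            return Double.valueOf(text);  // too many digits for a long
        }
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("about $12.50 each", 0)); // prints 12.5
        System.out.println(firstNumber("Page 3 of 10", 0));      // prints 3
        System.out.println(firstNumber("n/a", 7));               // prints 7
    }
}
```

Note that in this sketch a hyphen directly before a digit is read as a minus sign, so "item-5" yields -5.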

hasLabel

public boolean hasLabel(java.lang.String name)
Test if a label is set.

Parameters:
name - name of label (case-sensitive, whitespace permitted)
Returns:
true if label is set, otherwise false

hasAnyLabels

public boolean hasAnyLabels(java.lang.String expr)
Test if one or more of several labels are set.

Parameters:
expr - a list of label names separated by spaces
Returns:
true if region has at least one of the labels in expr

hasAnyLabels

public boolean hasAnyLabels(java.lang.String[] labels)
Test if one or more of several labels are set.

Parameters:
labels - an array of label names
Returns:
true if region has at least one of the labels

hasAllLabels

public boolean hasAllLabels(java.lang.String expr)
Test if all of several labels are set.

Parameters:
expr - a list of label names separated by spaces
Returns:
true if region has all of the labels in expr

hasAllLabels

public boolean hasAllLabels(java.lang.String[] labels)
Test if all of several labels are set.

Parameters:
labels - an array of label names
Returns:
true if region has all of the labels
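The difference between the "any" and "all" tests on a space-separated expression can be sketched as below, using a Set of names as a stand-in for the region's labels. LabelTestSketch is a hypothetical name. Treating an empty expression as vacuously satisfying hasAll is an assumption. Since names in an expression are split on whitespace, a label whose name contains whitespace presumably has to be tested through the array overloads.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical any/all label tests; not websphinx code.
final class LabelTestSketch {
    // Splits a space-separated list of names; blank input yields no names.
    static String[] parse(String expr) {
        String trimmed = expr.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    static boolean hasAny(Set<String> labels, String expr) {
        for (String name : parse(expr)) {
            if (labels.contains(name)) {
                return true;
            }
        }
        return false;
    }

    // ASSUMPTION: an expression with no names is trivially satisfied.
    static boolean hasAll(Set<String> labels, String expr) {
        for (String name : parse(expr)) {
            if (!labels.contains(name)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> labels = new HashSet<>(Arrays.asList("title", "link"));
        System.out.println(hasAny(labels, "title author")); // prints true
        System.out.println(hasAll(labels, "title author")); // prints false
        System.out.println(hasAll(labels, "link title"));   // prints true
    }
}
```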

removeLabel

public void removeLabel(java.lang.String name)
Remove a label.

Parameters:
name - name of label (case-sensitive, whitespace permitted)

setField

public void setField(java.lang.String name,
                     Region region)
Name a subregion (by setting a label to point to it).

Parameters:
name - label name (case-sensitive, whitespace permitted)
region - subregion to name

getField

public Region getField(java.lang.String name)
Get a named subregion.

Parameters:
name - label name (case-sensitive, whitespace permitted)
Returns:
the named region, or null if label not set to a region

setFields

public void setFields(java.lang.String name,
                      Region[] regions)
Name a set of subregions (by pointing a label to them).

Parameters:
name - label name (case-sensitive, whitespace permitted)
regions - list of subregions

getFields

public Region[] getFields(java.lang.String name)
Get a set of named subregions. Note that subregions named with setField() cannot be retrieved with getFields(); use getField() instead.

Parameters:
name - label name (case-sensitive, whitespace permitted)
Returns:
the named subregions, or null if label not set to a set of subregions
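The note about setField() and getFields() follows naturally if a label holds either a single region or an array of regions, and each getter checks for its own type. The sketch below models that reading. FieldSketch and Sub are hypothetical stand-ins, and the type-check approach is an assumption about how the distinction might arise, not a description of the library's code.

```java
import java.util.Hashtable;

// Hypothetical model of single versus multiple named subregions; not websphinx code.
final class FieldSketch {
    // Stand-in for a subregion.
    static final class Sub {
        final int start;
        final int end;

        Sub(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    private final Hashtable<String, Object> names = new Hashtable<>();

    void setField(String name, Sub region) {
        names.put(name, region);
    }

    void setFields(String name, Sub[] regions) {
        names.put(name, regions);
    }

    // Returns null unless the label holds a single subregion.
    Sub getField(String name) {
        Object value = names.get(name);
        return value instanceof Sub ? (Sub) value : null;
    }

    // Returns null unless the label holds an array of subregions.
    Sub[] getFields(String name) {
        Object value = names.get(name);
        return value instanceof Sub[] ? (Sub[]) value : null;
    }

    public static void main(String[] args) {
        FieldSketch page = new FieldSketch();
        page.setField("title", new Sub(0, 5));
        System.out.println(page.getFields("title") == null); // prints true
        System.out.println(page.getField("title") != null);  // prints true
    }
}
```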