java.lang.Object
  edu.cmu.cs.readweb.util.Crawler
Field Summary | |
static java.lang.String |
DISALLOW
|
static int |
pageAddesToSearch
|
static int |
pageStreamCalled
|
static java.util.Vector |
vectorToSearch
|
Constructor Summary | |
Crawler()
Creates a crawler that collects web pages from the World Wide Web. |
Method Summary | |
static java.lang.String |
convertLegalUrlName(java.lang.String uName)
Each HTML file is stored under its URL name; this method converts that name to a legal form. |
static void |
CrawlDomain(java.lang.String startURL,
java.lang.String startDomain,
java.lang.String cacheDir,
int SEARCH_LIMIT)
Crawls web pages within a domain, starting from a given URL.
startURL: the URL at which crawling starts
startDomain: a URL that constrains the domain in which pages are crawled
cacheDir: the name of the directory in which crawled pages are cached
SEARCH_LIMIT: the maximum number of pages to crawl
Example: |
static java.lang.String |
CrawlPage(java.lang.String inputURL,
java.lang.String cacheDir)
Crawls the page at the given URL. |
static java.lang.String |
getPageStream(java.lang.String inputUrl)
Gets the content of the web page at the given URL. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
public static java.util.Vector vectorToSearch
public static int pageStreamCalled
public static int pageAddesToSearch
public static final java.lang.String DISALLOW
Constructor Detail |
public Crawler()
Method Detail |
public static java.lang.String CrawlPage(java.lang.String inputURL, java.lang.String cacheDir)
public static void CrawlDomain(java.lang.String startURL, java.lang.String startDomain, java.lang.String cacheDir, int SEARCH_LIMIT)
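The parameters of CrawlDomain describe a crawl that starts at startURL, stays inside startDomain, and stops after SEARCH_LIMIT pages. The following is a minimal sketch of that idea, not the actual implementation of this class. The class name DomainCrawlSketch, the linksOf callback, and the prefix test used to decide whether a URL belongs to the domain are all assumptions made for illustration; caching to cacheDir and real network access are left out so the sketch runs on its own.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: breadth-first crawl limited to one domain and a page budget.
final class DomainCrawlSketch {
    private DomainCrawlSketch() {}

    /**
     * Returns the URLs visited, in visiting order.
     * linksOf is an assumed stand-in for "fetch the page and extract its links".
     */
    static List<String> crawl(String startUrl, String startDomain, int searchLimit,
                              Function<String, List<String>> linksOf) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();          // URLs already queued or visited
        Deque<String> frontier = new ArrayDeque<>(); // URLs waiting to be visited

        // Assumption: a URL is "in the domain" when it starts with startDomain.
        if (startUrl.startsWith(startDomain)) {
            frontier.add(startUrl);
            seen.add(startUrl);
        }
        while (!frontier.isEmpty() && visited.size() < searchLimit) {
            String url = frontier.poll();
            visited.add(url);
            for (String link : linksOf.apply(url)) {
                // Queue each in-domain link exactly once.
                if (link.startsWith(startDomain) && seen.add(link)) {
                    frontier.add(link);
                }
            }
        }
        return visited;
    }
}
```

Passing the link extractor as a function keeps the traversal logic separate from page fetching, so the limit and the domain constraint can be checked against an in-memory link table.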
public static java.lang.String convertLegalUrlName(java.lang.String uName)
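Because crawled pages are stored under their URL names, a URL has to be turned into a name that a file system accepts. The sketch below shows one way this could be done; it is not the documented behavior of convertLegalUrlName. The class UrlFileNames, the method toLegalFileName, and the rule of replacing every character outside letters, digits, '.', '-' and '_' with an underscore are assumptions.

```java
// Hypothetical sketch: derive a file-system-safe name from a URL.
final class UrlFileNames {
    private UrlFileNames() {}

    /** Replaces every character outside [A-Za-z0-9._-] with '_'. */
    static String toLegalFileName(String url) {
        StringBuilder sb = new StringBuilder(url.length());
        for (int i = 0; i < url.length(); i++) {
            char c = url.charAt(i);
            boolean safe = (c >= 'A' && c <= 'Z')
                    || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9')
                    || c == '.' || c == '-' || c == '_';
            sb.append(safe ? c : '_');
        }
        return sb.toString();
    }
}
```

With this rule, "http://www.cs.cmu.edu/index.html" becomes "http___www.cs.cmu.edu_index.html". The mapping is lossy: two URLs that differ only in replaced characters produce the same name, so a real cache might need an additional disambiguating step.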
public static java.lang.String getPageStream(java.lang.String inputUrl) throws java.net.MalformedURLException
Throws: java.net.MalformedURLException
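getPageStream declares java.net.MalformedURLException, which the standard java.net.URL constructor throws when a string has no protocol or an unknown one. The sketch below shows how a caller might check a string before passing it on. The class UrlCheckSketch and the method isWellFormed are hypothetical helpers and are not part of Crawler.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper: tests whether java.net.URL accepts a string.
final class UrlCheckSketch {
    private UrlCheckSketch() {}

    @SuppressWarnings("deprecation") // URL(String) is deprecated in recent JDKs but still available
    static boolean isWellFormed(String spec) {
        try {
            new URL(spec); // throws if the protocol is missing or unknown
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }
}
```

A string such as "www.cs.cmu.edu" is rejected because it names no protocol, while "http://www.cs.cmu.edu/" is accepted. The check only parses the string; it does not contact the server.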