NiagaraCQ: a scalable continuous query system for Internet databases

abstract

Continuous queries are persistent queries that allow users to receive new results when they become available. While continuous query systems can transform a passive web into an active environment, they need to be able to support millions of queries due to the scale of the Internet. No existing system has achieved this level of scalability. NiagaraCQ addresses this problem by grouping continuous queries based on the observation that many web queries share similar structures. Grouped queries can share common computation, tend to fit in memory, and can reduce the I/O cost significantly. Furthermore, grouping on selection predicates can eliminate a large number of unnecessary query invocations. Our grouping technique is distinguished from previous group optimization approaches in the following ways. First, we use an incremental group optimization strategy with dynamic re-grouping. New queries are added to existing query groups, without having to regroup already installed queries. Second, we use a query-split scheme that requires minimal changes to a general-purpose query engine. Third, NiagaraCQ groups both change-based and timer-based queries in a uniform way. To ensure that NiagaraCQ is scalable, we have also employed other techniques including incremental evaluation of continuous queries, use of both pull and push models for detecting heterogeneous data source changes, and memory caching. This paper presents the design of the NiagaraCQ system and gives some experimental results on the system's performance and scalability.
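
The grouping idea in the abstract can be illustrated with a minimal Python sketch. It is not NiagaraCQ's implementation: the class names (`Query`, `QueryGroup`, `GroupManager`), the source name `quotes.xml`, and the restriction to a single equality predicate per query are all assumptions made for illustration. The sketch shows two points from the abstract: a new query joins an existing group without regrouping installed queries, and grouping on the selection predicate means a changed record triggers only the queries whose constant matches.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Query:
    """A hypothetical continuous query: one equality selection on one source."""
    query_id: str
    source: str      # data source the query watches (assumed name)
    attribute: str   # attribute used in the selection predicate
    constant: str    # constant the attribute is compared with


@dataclass
class QueryGroup:
    """Queries sharing a signature: same source and attribute, any constant."""
    # Maps each constant to the ids of the queries that select on it.
    by_constant: dict = field(default_factory=lambda: defaultdict(list))


class GroupManager:
    def __init__(self):
        self.groups = {}  # (source, attribute) -> QueryGroup

    def install(self, query):
        """Add a query to its matching group; installed queries are untouched."""
        signature = (query.source, query.attribute)
        group = self.groups.setdefault(signature, QueryGroup())
        group.by_constant[query.constant].append(query.query_id)

    def on_change(self, source, record):
        """Return ids of the queries whose predicate matches a changed record."""
        triggered = []
        for (src, attribute), group in self.groups.items():
            if src != source or attribute not in record:
                continue
            # One lookup per group replaces one evaluation per query.
            triggered.extend(group.by_constant.get(record[attribute], []))
        return triggered


manager = GroupManager()
manager.install(Query("q1", "quotes.xml", "symbol", "INTC"))
manager.install(Query("q2", "quotes.xml", "symbol", "MSFT"))
manager.install(Query("q3", "quotes.xml", "symbol", "INTC"))

triggered = manager.on_change("quotes.xml", {"symbol": "INTC", "price": "42"})
print(triggered)  # ['q1', 'q3']
```

In this sketch the three queries fall into one group, and the change to the INTC record invokes only `q1` and `q3`; `q2` is never evaluated. Timer-based queries, the query-split scheme, and incremental evaluation are outside the scope of the example.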

top 10 topics

{ terms relevant query queries term effectiveness expansion keywords log refinement }
{ state critical degree exhibit continuous analytical absence spread finite scaling }
{ nodes wireless protocol routing protocols node sensor peertopeer scalable hoc }
{ groups group traditional explore benefits benefit addressed promise proposes extends }
{ memory parallel working capacity architectures hardware operations running implementations graphics }
{ approaches existing requires currently rapid advantages representative feasible guided explosion }
{ search internet strategies searching engine engines searches google millions locate }
{ region corresponding combines segmentation boundaries binary minimum prove spectral edge }
{ system performance file storage operating failure server files failures prototype }
{ behavior dynamic adaptive optimization artificial intelligence paradigm changing behaviors adapt }

top 10 topics after corrections

{ terms relevant query queries term effectiveness expansion keywords log refinement }
{ nodes wireless protocol routing protocols node sensor peertopeer scalable hoc }
{ system performance file storage operating failure server files failures prototype }
{ approaches existing requires currently rapid advantages representative feasible guided explosion }
{ effective designed amount efficiency reduce improving tracking rely handle filters }
{ groups group traditional explore benefits benefit addressed promise proposes extends }
{ algorithms clustering fast realworld speed faster approximation combinatorial algorithmic approximate }
{ review discuss challenges questions question answer focuses enhance focusing highlight }
{ need able increasing challenge require needed leading appropriate improvements date }
{ programming practical computation easy extension implement extensions implementing familiar computations }

top 25 related documents

Real life information retrieval: a study of user queries on the Web
Query clustering using user logs
Hourly analysis of a very large topically categorized web query log
Probabilistic query expansion using query logs
Continuously Adaptive Continuous Queries over Streams
Keyword Searching and Browsing in databases using BANKS
Improving Automatic Query Expansion
Patterns of Search: Analyzing and Modeling Web Query Refinement
Concept Based Query Expansion
Using terminological feedback for web search refinement: a log-based study
A Day in the Life of PubMed: Analysis of a Typical Day’s Query Log
Personalized query expansion for the web
Query Recommendation using Query Logs in Search Engines
Defining a session on Web search engines
Extracting semantic relations from query logs
Efficient Algorithms for Processing XPath Queries
A review of ontology based query expansion
Analysis of a very large web search engine query log
Generating Query Substitutions
SIP: Session Initiation Protocol
Context-aware query classification
Mining term association patterns from search logs for effective query reformulation
Okapi at TREC
Selective Sampling Using the Query by Committee Algorithm
Robust classification of rare queries using web knowledge