
Unifying databases and Internet-scale publish/subscribe


2008

Abstract (summary)

With the advent of Web 2.0 and the Digital Age, we are witnessing an unprecedented increase in the amount of information collected, and in the number of users interested in different types of information. This growth means that traditional techniques, where users poll data sources for information of interest, are no longer sufficient. Polling too frequently does not scale, while polling less often may result in users missing important updates. The alternative push technology has long been the goal of publish/subscribe systems, which proactively push updates (events) to users with matching interests (expressed as subscriptions). The push model is better suited for ensuring scalability and timely delivery of updates, important in many application domains: personal (e.g., RSS feeds, online auctions), financial (e.g., portfolio monitoring), security (e.g., reporting network anomalies), etc.

Early publish/subscribe systems were based on predefined subjects (channels), and were too coarse-grained to meet the specific interests of different subscribers. The second generation of content-based publish/subscribe systems offers greater flexibility by supporting subscriptions defined as predicates over message contents. However, subscriptions are still stateless filters over individual messages, so they cannot express queries across different messages or over the event history. The few systems that support more powerful database-style subscriptions do not address the problem of efficiently delivering updates to a large number of subscribers over a wide-area network. Thus, there is a need to develop next-generation publish/subscribe systems that unify the support for richer database-style subscription queries and flexible wide-area notification. This support needs to be complemented with robust processing and dissemination techniques that scale to high event rates and large databases, as well as to a large number of subscribers over the Internet.
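
The distinction drawn above between stateless content-based filters and stateful, database-style subscriptions can be illustrated with a minimal Python sketch. The class names, the `price` attribute, and the moving-average condition are hypothetical and chosen only for illustration; they are not taken from the dissertation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, float]  # hypothetical event shape: attribute name -> value


@dataclass
class StatelessSubscription:
    """Content-based subscription: a predicate over a single message."""
    predicate: Callable[[Event], bool]

    def matches(self, event: Event) -> bool:
        return self.predicate(event)


class MovingAverageSubscription:
    """Stateful subscription: depends on event history, not one message."""

    def __init__(self, window: int, threshold: float):
        self.window = window
        self.threshold = threshold
        self.history: List[float] = []

    def matches(self, event: Event) -> bool:
        # Record the new price, then average the most recent `window` prices.
        self.history.append(event["price"])
        recent = self.history[-self.window:]
        return sum(recent) / len(recent) > self.threshold


events = [{"price": 90.0}, {"price": 120.0}, {"price": 130.0}]
stateless = StatelessSubscription(lambda e: e["price"] > 100)
stateful = MovingAverageSubscription(window=2, threshold=110)

stateless_hits = [stateless.matches(e) for e in events]
stateful_hits = [stateful.matches(e) for e in events]
```

The stateless filter can be evaluated anywhere in a content-driven network because each decision needs only the message itself, whereas the moving-average subscription must keep per-subscription history somewhere.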

The main contribution of our work is a collection of techniques to support efficient and scalable event processing and notification dissemination for an Internet-scale publish/subscribe system with a rich subscription model. We investigate the interface between event processing by a database server and notification delivery by a dissemination network. Previous research in publish/subscribe has largely been compartmentalized; database-centric and network-centric approaches each have their own limitations, and simply putting them together does not lead to an efficient solution. A closer examination of database/network interfaces yields a spectrum of new and interesting possibilities. In particular, we propose message and subscription reformulation as general techniques to support stateful subscriptions over existing content-driven networks, by converting them into equivalent but stateless forms. We show how reformulation can successfully be applied to various stateful subscriptions including range-aggregation, select-joins, and subscriptions with value-based notification conditions. These techniques often provide orders-of-magnitude improvement over simpler techniques adopted by current systems, and are shown to scale to millions of subscriptions. Further, the use of a standard off-the-shelf content-driven dissemination interface allows these techniques to be easily deployed, managed, and maintained in a large-scale system.
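
As a rough illustration of the reformulation idea described above, the following Python sketch shows one way a stateful subscription ("notify me when the running total for an item first exceeds my threshold") might be turned into a stateless predicate. It is an assumption-laden toy, not the dissertation's algorithm: the function names `reformulate` and `crossing_filter` and the attributes `item`, `qty`, `prev_total`, and `total` are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator


def reformulate(events: Iterable[Dict], key: str = "item",
                value: str = "qty") -> Iterator[Dict]:
    """Database-side step: attach running totals to each message so that
    downstream filters need no state of their own."""
    totals: Dict[str, float] = {}
    for e in events:
        prev = totals.get(e[key], 0.0)
        totals[e[key]] = prev + e[value]
        yield {**e, "prev_total": prev, "total": totals[e[key]]}


def crossing_filter(threshold: float) -> Callable[[Dict], bool]:
    """Network-side stateless predicate over one reformulated message:
    true only for the message on which the total crosses the threshold."""
    return lambda m: m["prev_total"] <= threshold < m["total"]


events = [
    {"item": "a", "qty": 4},
    {"item": "a", "qty": 5},
    {"item": "a", "qty": 3},
]
subs = {"alice": crossing_filter(5), "bob": crossing_filter(10)}

# For each reformulated message, list the subscribers whose filter matches.
notified = [[name for name, f in subs.items() if f(m)]
            for m in reformulate(events)]
```

In this sketch the shared state (the running total) is maintained once at the server, and each subscriber's condition becomes a simple predicate that an off-the-shelf content-driven network could evaluate per message.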

Based on our findings, we have built a high-performance publish/subscribe system named ProSem (to signify the inseparability of database processing and network dissemination). ProSem uses our novel techniques for group-processing many types of complex and expressive subscriptions, with a per-event optimization framework that chooses the best processing and dissemination strategy at runtime based on online statistics and system objectives.

Indexing (details)


Subject
Computer science
Classification
0984: Computer science
Identifier / keyword
Applied sciences; Databases; Dissemination; Internet; Networks; Processing; Publish/subscribe; Publishing; Subscribing
Title
Unifying databases and Internet-scale publish/subscribe
Author
Chandramouli, Badrish
Number of pages
229
Publication year
2008
Degree date
2008
School code
0066
Source
DAI-B 69/07, Dissertation Abstracts International
Place of publication
Ann Arbor
Country of publication
United States
ISBN
9780549655824
Advisor
Yang, Jun
Committee member
Babu, Shivnath; Chase, Jeff; Ellis, Carla; Lei, Hui
University/institution
Duke University
Department
Computer Science
University location
United States -- North Carolina
Degree
Ph.D.
Source type
Dissertations & Theses
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
3315355
ProQuest document ID
304637737
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Document URL
http://search.proquest.com/docview/304637737