Software Technology Track
of the
32nd Hawaii International Conference on System Sciences, HICSS-32
Maui, Hawaii - January 5-8, 1999

Distributed Caching and Replication Minitrack


Introduction

Minitrack description

Deadlines

Instructions for authors and referees

Minitrack coordinators


Introduction

In distributed systems, such as information systems like the World-Wide Web (WWW) or distributed shared memory (DSM) systems, data is either available locally or must be fetched from remote servers, often over extremely long distances. Remote retrieval consumes scarce resources such as communication bandwidth, and fetching data over long distances often results in unacceptably long response times. A key idea for solving the problem is to reduce the number of remote data retrieval operations, or at least the number of retrievals over excessive distances. This can be achieved in multiple ways. Two particularly interesting approaches are distributed caching and replication. In contrast to other approaches, for instance mirroring data on so-called mirror sites, both distributed caching and replication can be implemented transparently from the user's perspective. Thus, users need not change their behavior to benefit from these two techniques. Distributed caching and replication can be briefly described as follows:

Distributed Caching

By locally storing (i.e., caching) frequently retrieved data at the point of use, communication bandwidth can be saved and response time shortened, since communicating with remote servers is no longer necessary: the local copy is returned to the requesting user. Depending on user behavior or on the user application, different cache consistency strategies can be adopted, leading to more or less effective use of the cache. Distributed caching as described works "from the user's point of view."
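
As an illustration only, the following minimal Python sketch shows a cache kept at the point of use with one possible consistency strategy, a fixed time-to-live (TTL). The class name LocalCache, the fetch_remote callback, and the injectable clock are assumptions made for this example; they are not taken from any particular system, and other consistency strategies are equally possible.

```python
import time
from typing import Callable, Dict, Tuple


class LocalCache:
    """Hypothetical client-side cache with a time-to-live consistency policy."""

    def __init__(self, fetch_remote: Callable[[str], bytes], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch_remote = fetch_remote  # assumed callback that contacts the remote server
        self._ttl = ttl_seconds            # how long a local copy is considered fresh
        self._clock = clock                # injectable so the policy can be tested
        self._entries: Dict[str, Tuple[bytes, float]] = {}
        self.remote_fetches = 0            # counts remote retrievals actually performed

    def get(self, key: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            # Fresh local copy: answer without any remote communication.
            return entry[0]
        # Missing or stale copy: retrieve remotely and remember the result.
        value = self._fetch_remote(key)
        self.remote_fetches += 1
        self._entries[key] = (value, now)
        return value
```

A shorter TTL keeps local copies closer to the remote state at the cost of more remote retrievals; a longer TTL saves bandwidth but tolerates staler data.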

Replication

In contrast, one can also cache data "from the data provider's point of view." Here, the provider (synonymously, the owner) of the data places copies at multiple locations, preferably close to the intended users. Caching from the data provider's point of view is known as replication. When a user wants to retrieve data, in the optimal case a nearby server holding a local replica is contacted. This results in lower bandwidth consumption and shorter response times than in the non-replicated case. The management of replicas (i.e., cached copies) at the servers is controlled solely by the data provider. Since the provider can be assumed to be best informed about the nature of the data it maintains, management protocols such as allocation and consistency control schemes can be optimized on a per-data-item basis.
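
The provider-side view can be sketched in the same spirit. In the hypothetical Python example below, the provider decides per data item where replicas are placed, propagates updates eagerly to every replica, and directs a reader to the closest server holding a copy. The Provider class, its method names, and the distance table are assumptions for illustration; eager propagation is just one of many possible consistency control schemes.

```python
from typing import Dict, Iterable, Set


class Provider:
    """Hypothetical data provider that controls replica allocation and consistency."""

    def __init__(self, origin: str) -> None:
        self.origin = origin                              # name of the provider's own server
        self.data: Dict[str, bytes] = {}                  # authoritative copies
        self.placement: Dict[str, Set[str]] = {}          # item -> servers holding a replica
        self.replicas: Dict[str, Dict[str, bytes]] = {}   # server -> item -> replica

    def publish(self, item: str, value: bytes, servers: Iterable[str]) -> None:
        # Allocation is decided per data item by the provider.
        self.data[item] = value
        self.placement[item] = set(servers)
        for server in self.placement[item]:
            self.replicas.setdefault(server, {})[item] = value

    def update(self, item: str, value: bytes) -> None:
        # Eager propagation: every replica is refreshed on each update.
        self.data[item] = value
        for server in self.placement.get(item, set()):
            self.replicas[server][item] = value

    def locate(self, item: str, distance: Dict[str, float]) -> str:
        # Choose the closest server that holds the item; the origin always does.
        candidates = self.placement.get(item, set()) | {self.origin}
        return min(candidates, key=lambda server: distance[server])

    def read(self, item: str, server: str) -> bytes:
        if server == self.origin:
            return self.data[item]
        return self.replicas[server][item]
```

Because placement and propagation are per item, a provider could, for example, replicate rarely changing items widely while keeping frequently updated items at the origin only.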

Combinations

Although the above discussion indicates that distributed caching and replication can each be used efficiently on their own, the question arises whether one can do better by combining the two techniques. Clearly, one can identify scenarios in which the exclusive use of either distributed caching or replication solves a resource management problem, but for the vast majority of scenarios the combination of the two techniques appears to lead to a superior solution. How caching and replication can support each other, how they restrict the use of certain consistency and allocation control schemes, and how a particular combination performs under which conditions are only a few of the important questions that arise in this context.


Description of the Minitrack

The aim of the interdisciplinary minitrack "Distributed Caching and Replication" is to bring together researchers with database, operating system, compiler, and hardware/software distributed shared memory system backgrounds who are currently working on bridging the gap between distributed caching and replication. Besides answers to the questions stated above, particular attention will be given to unifying frameworks, operating system support, compiler support, and prototype implementations, as well as performance evaluations of approaches that are "combined" in the above sense compared to approaches relying solely upon either distributed caching or replication. To summarize the discussion, the minitrack calls for papers dealing with distributed caching and replication focusing on the following topics:


Important Deadlines

A 300-word abstract by March 16
Feedback to author on abstract by April 15
Eight copies of the manuscript by June 1
Notification of accepted papers by August 31
Camera-ready copies of accepted manuscripts are due by October 1


Instructions for Authors

Submit a 300-word abstract to one of the minitrack coordinators by March 16, 1998. Feedback on the appropriateness of the abstract will be sent to you by April 15, 1998. Submit eight (8) copies of the full manuscript by June 1, 1998. Manuscripts should have an abstract and be 22-25 typewritten, double-spaced pages in length. Papers must not have been previously presented or published, nor be currently submitted for journal publication. Each manuscript will be subjected to a rigorous refereeing process involving at least five reviewers. Individuals interested in refereeing papers should contact the minitrack coordinators directly.


Minitrack Coordinators


02/06/98 - Markus Pizka, pizka@informatik.tu-muenchen.de
