Reza Rejaie, Mark Handley, Haobo Yu, Deborah Estrin
University of Southern California
Information Sciences Institute
Marina Del Rey, CA 90292
reza, haoboy, estrin@isi.edu
AT&T Center for Internet Research at ICSI
Berkeley, CA 94704
mjh@aciri.org
Because of the need for congestion control in the Internet, multimedia streams should be quality adaptive. This implies that on a cache-hit, a proxy must replay a variable-quality cached stream whose quality is determined by the bandwidth of the first session.
This paper addresses the implications of congestion control and quality adaptation on proxy caching mechanisms for streaming applications. We present a fine-grain replacement algorithm for layered-encoded multimedia streams at Internet proxy servers, and describe a pre-fetching scheme to smooth out the variations in quality of a cached stream during subsequent playbacks. This enables the proxy to perform quality adaptation more effectively and maximizes the delivered quality. We also extend the semantics of popularity and introduce the idea of weighted hit to capture both the level of interest and the usefulness of a layer for a cached stream. Finally, we show that interaction between our replacement algorithm and pre-fetching results in the state of the cache converging to the optimal state such that the quality of a cached stream is proportional to its popularity, and the variations in quality of a cached stream are inversely proportional to its popularity. This implies that after serving several requests for a stream, the proxy can effectively hide low bandwidth paths to the original server from interested clients.
Keywords: Proxy Caching Mechanism, Congestion Control, Quality Adaptive Video Playback, Layered Transmission, Internet
During recent years, the rapid increase in commercial usage of the Internet has resulted in explosive growth in demand for web-based streaming applications [10,13]. This trend is expected to continue, and justifies the need for caching popular streams at a proxy server close to the interested clients. Proxy caching for multimedia streams also provides a good opportunity to support VCR-functionalities more interactively as the control latencies to the proxy are lower.
Despite the success of proxy caching in the Web, proxy servers have not been used effectively for caching of Internet multimedia streams such as audio and video. There may be several reasons for this:
The main challenge for proxy caching of Internet multimedia streams is the need for congestion control. In the Internet all end-systems are expected to perform congestion control to keep the network utilization high while limiting overload and improving inter-protocol fairness[6]. Since streaming media flows must co-exist with TCP-based flows such as HTTP, this congestion control will need to be TCP-friendly[14]. To prevent receiver buffer underflow when available bandwidth is limited and at the same time maximize the delivered quality to the end-user multimedia streaming applications should be quality adaptive - that is, they should change the compression ratio of the streamed data according to available network bandwidth. Thus proxy caches are likely to cache data that depends on the available bandwidth to the first client that retrieved a multimedia stream.
Once a stream is cached, the proxy can replay it from the cache for subsequent requests but it still needs to perform congestion control and quality adaptation during delivery. However, using the variable-quality cached stream to perform quality adaptation for subsequent requests could be problematic because there is no correlation between the variation in quality of the cached stream and the required quality for the new session.
We have been working on an end-to-end client/server architecture for playback of quality adaptive multimedia streams in a TCP-friendly fashion over the Internet[16]. There are many ways to adjust the quality but the one we are investigating is to use layered encoding. With a layered codec, the compressed data is spilt into a base layer which contains the most essential low quality information and other layers which provide optional enhancement information. Thus the more layers are played back, the better the quality becomes.
Layered organization of streams provides a perfect opportunity for a proxy cache to cope with variations in quality of a cached stream. If the cached stream is structured properly, the proxy is able to cope with variations in quality during subsequent playback by only playing a subset of layers from the cache, or by simultaneous playback from the cache and fetching additional layers from the original server.
The rest of this paper is organized as follows: First we present our end-to-end architecture and describe how proxy servers complement this architecture in section 2. The delivery procedures on a cache-miss and cache-hit are sketched out in section 3 where we describe a pre-fetching mechanism to cope with missing packets and variations in quality. We present details of the replacement algorithm in section 4. Section 5 addresses some of the related work on proxy caching for multimedia streams. Finally, section 6 concludes the paper and presents our future directions.
End-to-end congestion control is performed by the Rate Adaptation Protocol (RAP)[18] and acker modules at the server and client respectively. RAP is a rate-based congestion control mechanism that is suited for streaming applications. The RAP module continuously monitors the connection and regulates the server's transmission rate by controlling the inter-packet timing. The acker module acknowledges each packet, providing end-to-end feedback for monitoring the connection.
Error control is performed by the error control (EC) module at the server. It receives information about available bandwidth, loss rate and recent playout time from the RAP module. Based on this information, it either flushes packets from the server's buffer manager that were acknowledged or whose playout time has passed, or schedules retransmission of a lost packet if it has sufficiently high priority. The error control module can selectively retransmit those packets that have higher priority such as those from the base layer. Thus some losses are never retransmitted because they will miss their playout times, or they are low priority such as losses from the top layer.
The quality of the transmitted stream is adjusted by the Quality Adaptation (QA) module at the server[17]. This module periodically obtains information about the available bandwidth from the RAP module. Combining this information with the average retransmission rate provided by the error control module, the QA module adjusts the quality of the transmitted stream by adding or dropping layers accordingly. The client usually buffers some data by slightly delaying the playback of the first packet. Rate adaptation happens on a timescale of round-trip times but layers are added and dropped on a longer timescale. A combination of receiver buffering and quality adaptation is able to cope with variations of available bandwidth. Short term variations of transmission rate can be absorbed by buffering without adding or dropping layers whereas long term changes in available bandwidth trigger the quality adaptation mechanism to adjust the delivered quality of the stream by changing the number of transmitted layers.
The amount of available storage space of the proxy servers is proportional to the number of their clients and it must be generally sufficient to cache a few popular streams. All streams between the original source and the client or between the proxy server and the client must perform TCP-friendly congestion control and quality adaptation. This implies that not only the original server but also the proxy server must be able to support congestion control and quality adaptation. We do not make any assumption about the inter-cache architecture. Our work is orthogonal and applicable to different inter-cache architectures [2,22,26]. Note that the proposed proxy caching mechanism is not dependent on our end-to-end architecture. Thus details of congestion control or quality adaptation mechanisms do not affect our caching mechanism. In fact it can be adopted by any layered framework for unicast delivery of multimedia streams.
Our goals are to:
Each client directs its request to its configured proxy server. Upon arrival of a request, the proxy server looks up the cache for availability of the desired stream. The subsequent steps of the delivery procedure vary for a cache miss and a cache hit cases as we describe in the next two subsections.
The proxy can deploy two possible strategies to relay the stream:
The server performs congestion control and quality adaptation based on the state of the session between the server and the client. Thus the quality of the delivered stream is determined by the average available bandwidth during the session. If the server is located behind a low-bandwidth path across the Internet, the average quality of the delivered stream is low. Furthermore, there may still be packets missing from the delivered stream that have not been repaired during the session due to a failure to successfully retransmit them before their playout time at the client. Figure 3 shows an example portion of a layered stream that was cached at a proxy.
In summary, on a cache miss scenario, the client does not observe any benefit in terms of startup latency or improvement in quality, from presence of the proxy cache. Thus the quality of the session on a cache miss is the same as a direct playback from the server to the client.
A missing stream is always cached during its first playback. If cache space is exhausted, a sufficient number of segments from another cached stream are selected for replacement, and are flushed to make room for the new stream. Details of the replacement algorithm are explained in section 4.
Once the proxy has cached a stream for the first time, it has the
option to fill in the remaining packet losses in the cached
copy of the stream. It may wait until the next client requests
playback, or it may alternatively pro-actively pre-fetch the missing
packets from the server. If the proxy chooses to pre-fetch during an idle
time, it can simply use TCP for delivery as there are no timing constrains.
However if it fetches on demand, it will need to use RAP and combine this with
the pre-fetching mechanism described below. RAP is prefered to TCP in the
presence of timing constraints because TCP's reliability and in-order
delivery could result in a long delay while RAP only performs congestion
control.
Figure 4 depicts a portion of a cached stream after repairing
all the losses.
All the pre-fetched segments are cached even if they arrive later than their
playout time.
By pre-fetching the missing segments, the proxy smoothes out variations in
quality of the cached stream across the active layers.
The more a particular stream is played back from the cache, the smoother
its quality becomes across the active layers.
This decreases the probability of pre-fetching on subsequent cache hits
for sessions with the same or lower average bandwidth.
Figure 6 shows the quality of the cached stream after
the first playback.
The proxy needs to pre-fetch a higher layer as soon as the quality adaptation mechanism detects availability of higher bandwidth to improve the quality in the near future. This means that lower layers are played back from the cache while the higher layer is pre-fetched by the proxy ahead of time and is transmitted whenever the quality adaptation mechanism adds a new layer. Once a new layer is added, it is continuously pre-fetched until the available proxy-client bandwidth dictates the layer is dropped or the session ends. Figure 7 demonstrates pre-fetching of missing segments and new layers during a session. For example missing segments for L4 and L5 are pre-fetched during the interval [t1,t3] and [t2,t4]. Whereas L6 and L7 were pre-fetched as added layers. Pre-fetched layers are always cached. Figure 8 shows quality of the cached stream after a session with higher average bandwidth.
Although all the pre-fetched segments during a session may not be played back, they are cached in the hope that they will be used for the next request. This supports the conservative approach. To ensure in-time delivery of requested segments, the proxy should continuously monitor its round-trip-time to the active servers. This enables the proxy to issue a request ahead of time accordingly to absorb the delay in delivery. Pre-fetching is performed through a RAP connection because the strictly reliable stream-based nature of TCP may result in segments arriving after their playout point.
It is worth emphasizing that the more a particular cached stream is played back, the smoother its quality becomes across the active layers. Since the number of active layers is directly determined by the average bandwidth between the proxy and interested clients, this implies that the quality of the cached stream converges to the average quality across recent playback sessions.
Current replacement algorithms usually make a binary decision on the caching of an atomic object, i.e. the object is cached or flushed in its entirety based on the popularity of the object. As we mentioned earlier, layered multimedia streams are able to adjust their quality. This allows the replacement algorithm to make a multi-valued decision for caching of multimedia streams. As the popularity of a stream decreases, its quality and consequently its size is reduced before it is finally flushed from the cache. In other words, the popularity of a cached stream primarily affects its quality and then its status of residency in the cache.3 The coarse-grain adjustment in quality is achieved by dropping the highest layer, called a victim layer. However, to maximize the efficiency of the cache and avoid fragmentation of the cache space, the victim layer is dropped with the granularity of a segment. To hide the startup latency, it is prefered to keep the initial segments of a layer in the cache. Furthermore, to minimize the variation in quality it is prefered to keep contiguous segments of a layer. This leads us to the replacement pattern in figure 9. Once the highest layer is selected as a victim, its cached segments are flushed from the end toward the beginning in a demand-driven fashion.
We use a modified hit ratio of a cached stream as a metric to measure its popularity. The proxy can easily count number of cache hits for every cache resident stream during an interval, called popularity window. However we realize that the current definition of a hit is inadequate to effectively capture the subtleties of level of interest in a particular stream. This is mainly due to the time scale for delivery of multimedia streams being several orders of magnitude longer than for web objects. Most of the current schemes assign a binary value to a hit, i.e. 0 for lack of interest and 1 for each request. This model perfectly suits the atomic nature of interest in web objects, i.e. the client is either interested in the entire object or is not interested at all. In the context of streaming applications, the client can interact with the server and perform VCR-functionalities (i.e. Stop, Fast forward, Rewind, Play). Intuitively, the popularity of each stream must reflect the level of interest that is observed through this interaction. We assume that the total duration of playback for each stream indicates the level of interest in that stream. For example if a client only watches half of one stream, their level of interest is half of a client who watches the entire display. This approach can also include weighted duration of fast forward or rewind with proper weighting.
Based on this observation we extend the semantic of a hit and introduce
a weighted hit, called a whit, which is defined as follows
4:
,
![]()
where
PlaybackTime and
StreamLength denote total playback time of
a session and length of the entire stream respectively.
Both
PlaybackTime and
StreamLength have dimension of time (i.e.
measured in second). The proxy server keeps
track of weighted hits on a per-layer basis. For each layer of a session,
the total playback time for each layer is recorded and used to
calculate the whit for that layer at the end of the session. The
cumulative value of whit during a recent window is used as a
popularity index of a layer of a cached stream. The popularity of each
layer is recalculated at the end of a session as follows:
![]()
where P and
denote popularity and the width of the popularity
window respectively.
Notice that the behavior of the quality adaptation mechanism results in different PlaybackTime for different layer in a session and consequently affects the value of a cached layer. Since the available bandwidth directly controls the number of active layers, the longer a layer is played back for interested clients during recent sessions, the higher is the probability of using that layer in future sessions. For example, if the majority of clients who are interested in streaming Titanic have low bandwidth, the quality adaptation mechanism can only play back say 3 layers. Thus higher layers should not be cached because there is a low probability of using those layers. This implies that the total playback duration of a layer indicates its value. To incorporate this parameter, the server must separately calculate the popularity for each layer of a cached stream at the end of a session. Applying the definition of popularity on a per-layer basis is in fact compatible with our proposed replacement pattern because layered decoding guarantees that popularity of different layers of each stream monotonically decreases with the layer number5.
The proxy maintains a popularity table such as Table 1. Each table entry consists of stream name, layer number, popularity of the layer and a locking flag. Once the cache space is exhausted, the proxy flushes the last segments of the least popular layer (e.g. L1of ``Amistad'') until sufficient space becomes available. Because of the decoding requirement, the least popular layer is always the highest layer of a stream. Popularity of this layer could be low due to lack of interest among clients, or lack of sufficient bandwidth to play this layer for interested clients, or both6. Note that with the exception of the base layer of each stream, all segments of other layers can be flushed out if they are the least popular layer. The first few segments of the base layer for each stream are kept in the cache as long as its popularity is beyond a threshold to hide the startup latency of possible future requests.
It is worth noting that both the replacement pattern and popularity function are directly determined by the expected functionality from the proxy. For example if the main functionality of the proxy is to hide the startup latency, the proxy should adopt a different replacement pattern to maintain initial segments of all layers. The new replacement pattern flushes ending segments of lower layers before initial segments of higher layers. Another example is a proxy that is expected to cache the most popular portion of different movies. Then the proxy should keep track of per-segment or per-chunk popularity and replace the segments based on their popularity.
Video streams exhibit burstiness due to encoding algorithm and variations within and between frames. The variation poses a problem to both bandwidth requirement and playback buffer management. In order to smooth the burstiness during transmission, a number of schemes have been proposed. One scheme [23] pre-fetches and stores portions of a stream in proxy servers, and later uses them to smooth the stream during playback. Another scheme [20] stores prefixes of multimedia streams in proxy caches. It is able to improve startup latency in addition to performing smoothing.
Our work is complementary to the works on smoothing. They do not address congestion control of multimedia streams in the Internet. We focus on the design of a proxy caching mechanism which is aware of congestion control mechanisms used in the transmission of multimedia streams. Cached streams with our mechanism can be also used to perform smoothing to facilitate client-side playback buffer management.
There are numerous works on proxy cache replacement algorithms [1,11,15,19,25,24]. They evaluate their algorithms using existing web proxy traces. However, the behavior of these algorithms for an access pattern with a significant number of requests to huge multimedia streams has not been studied. To our knowledge, there is only one work which addresses the influence of multimedia streams on cache replacement algorithms [21]. They consider the impact of resource requirements (i.e. bandwidth and space) on cache replacement algorithms. Our work complements this effort in that we provide a more accurate estimation of bandwidth and size requirements through the exposure of congestion control mechanisms. Thus their algorithms may easily be applied to our architecture. It remains as future work to examine all these replacement algorithms with our proposed scheme.
We plan to continue our work in several phases. For the first phase, we start with a simulation-based study of the caching mechanism to investigate the impact of different parameters on its performance. More specifically, we are interested in the effect of access patterns, the bandwidth distribution among clients, segment size, and of cached stream length distributions. Furthermore, we plan to study the overhead of conservative pre-fetching as well as other popularity functions. We plan to explore alternative replacement patterns and popularity functions. In the second phase, we plan to validate our simulation results by using a real world access pattern from actual traces collected at popular servers in the Internet.
Demand-driven pre-fetching is another interesting area for us to explore where clients specify their interests to a particular stream and its desired quality ahead of time. Providing VCR-functionality by the proxy and the implication of this on caching requires further investigation as well. We also plan to study potential effects of inter-cache architecture on the proxy caching mechanism.
This document was generated using the LaTeX2HTML translator Version 98.1p1 release (March 2nd, 1998)
Copyright © 1993, 1994, 1995, 1996, 1997, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
The command line arguments were:
latex2html -split 0 -antialias -antialias_text mc.
The translation was initiated on 1999-03-24