Adam Dingle (dingle@ksvi.mff.cuni.cz)
Kabinet Software a Vyuky Informatiky (KSVI)
Charles University, Prague, Czech Republic
While HTTP 1.0 proxy caches typically maintain cache consistency in an ad hoc fashion, the HTTP 1.1 proposed standard explicitly defines cache consistency and describes an expiration/validation model which caches may use. In this paper, we carefully examine and criticize HTTP 1.1's consistency model. We discuss various issues involved in implementing HTTP 1.1 cache consistency, and make several recommendations to implementors. We point out several areas where the HTTP 1.1 proposed specification has problems or could be clarified. Finally, we present a Java implementation of an HTTP 1.1 proxy cache which may be used as a model for cache implementors.
Keywords: HTTP, cache, consistency, coherence
Any cache which provides non-exclusive access to a set of mutable data must have some mechanism for keeping the cache's contents up to date with the data itself; such caches include memory caches in shared-memory multiprocessors, caches for distributed file systems, caches for the Internet Domain Name System (DNS), and, more recently, caches for the World Wide Web. The problem of keeping cache contents up to date is usually called consistency, and we will use this term in this paper; some literature about distributed file systems uses the alternate term coherence. Mechanisms used for Web cache consistency may differ substantially from those used for multiprocessors or file systems, for two reasons.[Din96] First, the Internet is more prone to connectivity failures than smaller, more tightly-coupled systems. Second, Web pages have value to users even if they are slightly out of date, which means that Web caches do not need to be perfectly consistent with home servers.
The HTTP 1.0 specification defines several headers related to cache consistency, including If-Modified-Since and Expires. A basic understanding of how caches are expected to maintain consistency can be inferred from the HTTP 1.0 specification, but the specification gives no explicit definition of cache consistency and leaves many issues open to the implementor. The early CERN httpd cache implementation uses the If-Modified-Since and Expires headers; as it was written by the same group that designed the HTTP protocol, its source code may be viewed as a precise definition of how HTTP 1.0 caches are expected to maintain consistency. Yet the implementation suffers from errors and hence fails to provide strong guarantees about the age of pages that caches may return [Din96, section 2.5.3].
The new HTTP 1.1 proposed standard (hereafter simply "HTTP 1.1" or "the HTTP 1.1 specification"), in contrast, pays a great deal of attention to caching and attempts to define cache consistency explicitly. It provides a number of new headers which allow clients and servers to specify the degree of consistency they require, and which allow warnings to be sent when consistency requirements are not met. Allowing clients to specify how much consistency they need is a particularly important step forward because it allows users to control the tradeoff between the staleness of returned pages and the delay of waiting for pages; this is especially crucial for users with low-bandwidth connections, such as users who dial into the Internet with modems or users in countries with poor Internet connectivity (such as the author). HTTP 1.1 also allows pages to be validated using explicit validators rather than last-modification dates, and provides a mechanism to compensate for clock skew. Furthermore, it specifies the interaction between caches and new HTTP 1.1 features such as byte ranges and content negotiation; these new features are beyond the scope of the present paper.
Despite the important steps forward in the HTTP 1.1 proposal, this paper will show that HTTP 1.1 does not fully and clearly define its cache consistency model, especially with regard to If-Modified-Since requests. Furthermore, the proposal does not address a number of important issues which will be critical to cache implementors. This paper complements the HTTP 1.1 proposal by carefully examining the motivation for HTTP 1.1's new cache consistency features and the ambiguities in the HTTP 1.1 specification, as well as several issues which implementors must address. The paper also presents a Java implementation of an HTTP 1.1 caching proxy; this implementation shows precisely how a cache might work, and could be used as a starting point for HTTP 1.1 cache implementors.
As mentioned above, CERN httpd (also distributed as W3C httpd) was an early HTTP 1.0 proxy cache implementation and uses the Expires and If-Modified-Since headers to maintain cache consistency; Netscape Proxy Server apparently uses a similar consistency mechanism. A flurry of HTTP 1.0 proxy server implementations has followed. The popular Harvest 1.x cache never generates If-Modified-Since requests; it throws out pages when they expire and completely reloads them on the next access, even if they have not changed. Its commercial successor, Harvest 2.0, has been improved to generate If-Modified-Since requests. Its non-commercial successor, Squid, has only very recently been updated to use If-Modified-Since (the change appeared in version squid-1.1.alpha17); the long wait for its implementation perhaps occurred because its implementors were not completely sure about how to deal with If-Modified-Since in a cache hierarchy, an area still not addressed clearly enough by the HTTP 1.1 proposal. Squid is an important cache implementation because a large international cache hierarchy has evolved in the last year, mostly using Squid. It is the author's sincere hope that Squid will be updated soon to handle full HTTP 1.1 cache consistency and that this paper will be helpful to the Squid cache implementors in this respect.
In this section we review HTTP 1.1's basic cache consistency model; see the specification for details.
The HTTP GET method is used to retrieve documents directly from a Web server or indirectly through a proxy cache. A Web server responds to a GET request by sending a document with a set of headers containing document metainformation. The header most fundamental to cache consistency is the Date header; it is mandatory and contains the time when the response was generated.
A caching proxy server responds to a GET request for a page P by first checking whether a copy of P is in the cache. If no copy of P is present, the proxy forwards the request to the top-level server or to a higher-level proxy server. When it receives a response, it stores the response, including the Date header and most other headers (excluding so-called hop-by-hop headers; see section 13.5 of the HTTP 1.1 proposal) in the cache and returns the response to the caller (the user or a lower-level proxy). The proxy leaves the Date header in the response untouched; thus when a user sends a GET request through a set of proxies, the Date header in the response contains the time when the top-level server generated the response, measured using the clock at the top-level server. HTTP 1.1 contains a mechanism to compensate for clock skew; see section 13.2.3 of the HTTP 1.1 proposal. This paper will not address HTTP's clock skew mechanism, and from now on in this paper we will assume that all clocks are synchronized.
If a proxy server receives a GET request for a page and finds a copy of the page in its cache, it must decide whether to return the cached page or to update the cached page. HTTP 1.1 proxy servers make this decision based on the age of the cached page, which is defined as the amount of time which has elapsed since the page was generated, and can be computed as the current time minus the value of the page's Date header. Upon receiving a request for a page, a cache computes the page's maximum age by combining requirements expressed by the server and the client; the exact mechanism for computing the maximum age is discussed in the following section. If a cached page's age is less than its maximum age, a cache can return the page without updating it; otherwise, a cache must update the page. A cache updates a cached page by using a GET request to reload it, passing along any client requirements specified in the original request. Some caches use the If-Modified-Since mechanism discussed in a following section to reload pages only if their contents have actually changed. Updating a cached page always results in a page whose age is less than the maximum age computed for the original request, and which can be returned in response to the original request (well, almost always; see the section "What to do when a request can't be satisfied" below).
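The age test described above can be sketched in Java as follows. This is a minimal illustration, not the implementation presented later in the paper: the class and method names are our own, times are taken as seconds since the epoch, and all clocks are assumed to be synchronized.

```java
// Sketch: decide whether a cached page may be returned or must be updated.
// All times are seconds since the epoch; clocks are assumed synchronized.
final class AgeCheck {
    /** Age of a cached page: the current time minus the value of its Date header. */
    static long age(long nowSeconds, long dateHeaderSeconds) {
        return nowSeconds - dateHeaderSeconds;
    }

    /** True if the cached copy may be returned without being updated. */
    static boolean canServeFromCache(long nowSeconds, long dateHeaderSeconds, long maxAgeSeconds) {
        // The page is usable only while its age is strictly less than its maximum age.
        return age(nowSeconds, dateHeaderSeconds) < maxAgeSeconds;
    }
}
```

For example, a page whose Date is 600 seconds in the past may be served under a maximum age of 601 seconds, but must be updated under a maximum age of 600 seconds.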
In the simplest case, the server alone determines a page's maximum age, and does so by returning a Cache-Control: max-age header with the page; the age value in this header is measured in seconds. HTTP 1.1 encourages servers to return this header with every page. If a cached page's age is less than the value of Cache-Control: max-age, the page is said to be fresh; otherwise, the page is stale. The page is said to expire at the moment it becomes stale; this moment is called the page's expiration time. If the client specifies no page age requirements, then a cache is required to update a requested page if the page is stale; if the client specifies its own requirements, the cache may be required to update a page even if it is not stale, or may be absolved from the responsibility of updating a stale page.
A server can also specify a maximum age using the Expires: header, which specifies an absolute time at which the page will expire rather than a time relative to the moment at which the page was returned from the server. The semantic differences between Cache-Control: max-age and Expires: are unimportant and are discussed in section 14.9.3 of the HTTP 1.1 proposal.
If a server provides neither Cache-Control: max-age nor Expires: for a given page, a proxy cache generally assigns the page a heuristic maximum age. If the server has provided a Last-Modified header for the page, the heuristic maximum age is typically a fraction of the difference between the page's Date and Last-Modified times; this is implied, but not stated clearly enough, in section 13.2.4 of the HTTP 1.1 specification. The moment at which a page's age exceeds its heuristic maximum age is called the page's heuristic expiration time. The HTTP 1.1 specification discourages the use of heuristic maximum ages/expiration times, and indeed as we will see such heuristics can cause problems in a cache hierarchy if different caches use different heuristics. Several years ago, experiments showed [Gla94] that few servers provide expiration times with pages, and unfortunately the situation is probably not much different today. There are probably two reasons why servers don't usually provide expiration times. The first is the common misconception that expiration times are appropriate only for pages which will definitely change at a given time, such as timetables and news pages. The second reason is that most HTTP servers aren't capable of generating expiration times in a way that is convenient for the server administrator. There is no reason why HTTP servers couldn't compute expiration times heuristically using a page's Last-Modified date, just as caches do today; doing so would eliminate the aforementioned cache hierarchy problems, and would leave server administrators with the flexibility to modify parameters of the expiration time heuristic, as well as the ability to explicitly assign expiration times to pages when desired.
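A heuristic of this kind might be sketched as follows. The class name, the method name and the percentage parameter are our own illustrative assumptions; neither the specification nor any particular cache is claimed to use these values.

```java
// Sketch: heuristic maximum age as a fraction of the time since last modification.
// The percentage is a tunable assumption, not a value taken from the specification.
final class HeuristicMaxAge {
    /**
     * @param dateSeconds         value of the page's Date header (seconds since the epoch)
     * @param lastModifiedSeconds value of the page's Last-Modified header
     * @param percent             fraction of (Date - Last-Modified) to use, in percent
     * @return heuristic maximum age in seconds
     */
    static long heuristicMaxAge(long dateSeconds, long lastModifiedSeconds, int percent) {
        // Guard against a Last-Modified value later than Date.
        long sinceModified = Math.max(0L, dateSeconds - lastModifiedSeconds);
        return sinceModified * percent / 100;
    }
}
```

With a setting of 10 percent, a page last modified ten days before it was served would be assigned a heuristic maximum age of one day.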
A user can supply the Cache-Control: max-age header on a request to specify their own requirement for the maximum age of a returned response; if both the user and the server specify maximum ages for a response, the minimum of the two max-age values is used to determine whether a cached page must be updated. A user who wants to relax the server's max-age requirement can send the Cache-Control: max-stale header, in which case a cached page can be returned to the user without being updated even if it has expired, so long as the time since the page's expiration is not greater than the number of seconds specified in the max-stale header. If in a GET request the user specifies max-stale with no value, caches will ignore the server's max-age requirement in determining that request's maximum age; from now on in this paper, we will denote this case using the notation "max-stale = (infinity)". The HTTP 1.1 specification is vague about the case in which the user specifies max-age or max-stale but the server provides neither max-age nor Expires. In this case, user requirements could possibly be combined with the cache's heuristic expiration time for the page; alternatively, the heuristic expiration time could be ignored.
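One way to combine these directives into a single maximum age for a request is sketched below. The names are our own, and the sketch assumes that the server's max-age is known; it takes no position on the heuristic case, which the specification leaves open.

```java
// Sketch: combine server max-age with client max-age and max-stale.
// All values are in seconds and are assumed to be non-negative.
final class RequestMaxAge {
    /** Stands for "no limit": an absent client max-age, or max-stale with no value. */
    static final long UNBOUNDED = Long.MAX_VALUE;

    /**
     * @param serverMaxAge   max-age supplied by the server
     * @param clientMaxAge   max-age supplied by the client, or UNBOUNDED if absent
     * @param clientMaxStale max-stale supplied by the client: 0 if absent,
     *                       UNBOUNDED if given with no value
     */
    static long maximumAge(long serverMaxAge, long clientMaxAge, long clientMaxStale) {
        // max-stale relaxes the server's requirement additively (saturating at UNBOUNDED).
        long relaxedServerLimit =
                (clientMaxStale == UNBOUNDED || serverMaxAge > UNBOUNDED - clientMaxStale)
                        ? UNBOUNDED
                        : serverMaxAge + clientMaxStale;
        // The client's own max-age can only tighten the result.
        return Math.min(relaxedServerLimit, clientMaxAge);
    }
}
```

Under these assumptions, a server max-age of 3600 combined with a client max-stale of 300 yields a maximum age of 3900 seconds, while a client max-age of 600 with no max-stale yields 600 seconds.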
Note that a page's maximum age is computed by combining information from the server and client, but a page's expiration time is computed using server information alone (or possibly using cache heuristics if the server says nothing about expiration). Thus, the notion of whether a page is stale or fresh at a given time is independent of any single user request for that page; this fact is not stated clearly in the HTTP 1.1 specification, and should be. The notion of maximum age, on the other hand, must be considered relative to each user request. The term freshness lifetime, frequently used in the HTTP 1.1 specification, is the amount of time which remains before a page expires; it is not the same as a page's maximum age.
Note also that the term "maximum age" as used here is essentially our own. The HTTP 1.1 specification refers to "the least [sic] restrictive freshness requirement of the client, server, and cache", a related concept, in section 13.1.1 "Cache Correctness". First, this is certainly an error: the term "least" should be "most". The error is evidenced by the fact that the complementary case, in which a warning must be issued, occurs when either the client's or the server's freshness demand is violated. Even when we correct the error, "maximum age" and "most restrictive freshness requirement" are not the same: when the client specifies max-stale in a request, the page's maximum age for the request will be greater than the most restrictive freshness requirement, which will be the page's freshness lifetime as determined by the server. To summarize: a requested page's maximum age determines the age at which a cache will update a cached page, and can be relaxed by the client; the most restrictive freshness requirement for a request determines the age at which a cache must attach a warning to a response, and cannot be relaxed by the client.
If a page's maximum age is too high, caches will perform consistency checks too infrequently and users will frequently receive page versions which are out of date. On the other hand, if a page's maximum age is too low, caches will perform consistency checks too frequently; then network bandwidth will be wasted performing needless consistency checks, and users may be subject to longer delays in waiting for pages to load. The optimal maximum age for a page, then, depends on several factors: how frequently the page is modified, how important those modifications are, and the cost and quality of network bandwidth near the client and near the server.
Several of these factors - the frequency and importance of page modifications - are known by the server. The cost and quality of network bandwidth are primarily properties of the client: a client who is poorly connected to the Internet may want to update pages infrequently to save precious network bandwidth, and thus might want to assign higher maximum age values. The cost and quality of network bandwidth near the server are typically less important, because servers tend to be better connected than clients, who may be behind modems or in poorer countries which typically import more information over the Internet than they export.
As servers will typically provide a maximum age for each page, it might be useful if a client could request pages whose maximum age were some constant factor (perhaps provided by a "Cache-Control: exp-factor" header) times the Cache-Control: max-age value provided by the server for the page. Then a poorly-connected client could request pages whose age was up to twice the max-age value specified by the server; a large cache on an Internet backbone which wanted to keep its pages extremely up to date could require that pages' ages be no more than half the max-age value specified by the server. HTTP 1.1 does indeed allow clients' and servers' freshness requirements to be combined in an additive way using the min-fresh and max-stale directives, but we feel that combining requirements multiplicatively would be more robust in the face of pages with widely varying expiration times.
As many clients will request pages without specifying any client freshness requirements, an HTTP server administrator must set page expiration times to be in some way reasonable for clients all over the Internet; this is problematic because it assumes that the cost and quality of network bandwidth is roughly constant across the Internet, which may not be true at all, and because it places the burden of knowing something about the entire Internet on the server administrator. If freshness requirements could be combined multiplicatively and HTTP clients commonly provided an "exp-factor" reflecting the quality of their Internet connectivity, HTTP servers could assign "relative" max-age values to pages on the basis of properties of those pages without taking the Internet as a whole into consideration. For example, the HTTP specification could suggest that servers set the maximum age of a page to be the amount of time after which the probability that the page has changed is expected to be, say, 10%; servers which set maximum page ages heuristically as a constant fraction of the time since the page was last modified could set the heuristic constant to be the same 10%. Then HTTP clients could use an "exp-factor" of 2 if they are willing to wait until there is an approximately 20% chance that a page has changed before updating it.
Despite the problems we've mentioned, absolute maximum ages allow server administrators to determine just how old the pages most users view might be, and are more appropriate than the "relative" maximum ages described above when pages are modified according to a regular schedule. For these reasons, it is likely that the current and future versions of HTTP will continue to suggest that servers provide max-age values which are interpreted absolutely by most clients.
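The multiplicative combination proposed here could be computed as in the following sketch. Note that "exp-factor" is only our suggested directive and is not part of HTTP 1.1; the class and method names are likewise hypothetical.

```java
// Sketch: multiplicative combination of the server's max-age with a client factor.
// "exp-factor" is a directive proposed in this paper, not one defined by HTTP 1.1.
final class ExpFactor {
    /**
     * @param serverMaxAgeSeconds max-age supplied by the server
     * @param expFactor           client-supplied multiplier (2.0 doubles, 0.5 halves)
     * @return maximum age for the request, in seconds
     */
    static long scaledMaxAge(long serverMaxAgeSeconds, double expFactor) {
        if (expFactor < 0) {
            throw new IllegalArgumentException("exp-factor must be non-negative");
        }
        return (long) Math.floor(serverMaxAgeSeconds * expFactor);
    }
}
```

A poorly connected client sending a factor of 2 against a server max-age of one hour would accept pages up to two hours old; a backbone cache sending 0.5 would update pages older than thirty minutes.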
HTTP's consistency mechanism allows caches in a hierarchy to meet clients' maximum age requirements while using a bounded amount of network traffic. To be more precise, consider a hierarchy of caches with a top-level cache T, which is the only cache which makes requests to non-cache HTTP servers. Suppose that a set of clients access page P through the hierarchy, and that P's maximum age for all requests is some constant A. (This maximum age A is most likely specified by the page's server using a max-age header, but could alternatively be specified by the requesting clients). Then on average, each cache in the hierarchy will update P at most once per A units of time. To see this, first consider the top-level cache T. Whenever T updates P, T receives a copy of P which is completely up to date. This copy can be used to service requests (including update requests) without consulting P's server for the next A units of time. This shows that T will update P at most once per A units of time. It also shows that every time T updates its cached copy of P, it receives a copy of P whose Date is at least A greater than the value of Date in the cached copy it is updating. Next, consider any other cache C in the hierarchy. Every time C updates its cached copy of P, it receives a copy whose Date is greater than the Date in the old copy it is updating. Because all cached copies ultimately come from T, the Date in C's new cached copy must be at least A greater than the Date in its old cached copy. Since every update operation increases the Date of the cached copy by at least A, and since the Date of the cached copy is never greater than the current time, on average C cannot update P more often than once per A units of time.
Incidentally, it is possible that a given cache will update a page twice during a single span of A units of time. To see this, suppose that A is 10 minutes, and that a cache has a copy of P whose Date is 10:00. At 10:38, the cache may receive a request for P, and might update the cached copy, receiving a version whose Date is 10:30. At 10:41, the cache might receive another request for P, and update its cached copy again, perhaps receiving a version whose Date is 10:40.
Up to now we have discussed HTTP 1.1's expiration mechanism, used to decide when pages in a cache need to be updated. HTTP also provides a validation mechanism which allows a cache to update a page by determining whether it has changed and reloading it only if it has. In HTTP 1.0, a cache can update a page by sending the If-Modified-Since header along with a GET request for the page. The cache places the Last-Modified date of the cached page copy in the If-Modified-Since header it sends. If the page has been modified on the server since the date in the If-Modified-Since header, the server returns the modified page exactly as it would to an ordinary GET request; if the page has not been modified, the server returns a 304 (Not Modified) message. A Not Modified message always contains a Date HTTP header and guarantees that the requested page had not been modified as of the time specified in that header; a cache which receives a Not Modified message when updating a cache entry is required (see section 10.3.5 of the specification) to update the Date stored in the cache entry with the Date in the Not Modified message. A Not Modified message may contain other headers, such as Expires, which can also be used to update a cache entry.
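The exchange just described can be sketched from both sides as follows. The class, its nested entry type and the method names are illustrative assumptions of ours; times are seconds since the epoch.

```java
// Sketch: If-Modified-Since validation, seen from the server and from the cache.
final class IfModifiedSince {
    static final int OK = 200;
    static final int NOT_MODIFIED = 304;

    /** Status a server returns for a GET request carrying If-Modified-Since. */
    static int status(long lastModified, long ifModifiedSince) {
        // The full page is sent only if it changed after the date the cache supplied.
        return lastModified > ifModifiedSince ? OK : NOT_MODIFIED;
    }

    /** Minimal cache entry holding only the two times relevant to validation. */
    static final class Entry {
        long date;               // value of the Date header currently stored
        final long lastModified; // value of the Last-Modified header

        Entry(long date, long lastModified) {
            this.date = date;
            this.lastModified = lastModified;
        }

        /** On a 304 response, the stored Date is replaced by the Date of the 304 message. */
        void applyNotModified(long notModifiedDate) {
            this.date = notModifiedDate;
        }
    }
}
```

After a Not Modified response the entry's age is measured from the new Date, so the cached body becomes usable again without having been retransmitted.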
HTTP 1.1 allows the use of validators other than pages' last-modified dates, and provides the headers ETag and If-None-Match for this purpose. These headers are beyond the scope of the present paper, and from now on we will consider only the If-Modified-Since header for page validation.
There has been some controversy over how to handle If-Modified-Since in a cache hierarchy. Apparently some people feel that every If-Modified-Since request should be passed all the way up the cache hierarchy to the origin server; others feel that If-Modified-Since requests should stop at some point on the cache hierarchy if a cache has a copy of the requested page that is new enough. The HTTP 1.1 specification is unfortunately vague in this respect. In fact, in the definition of cache correctness in section 13.1.1 the specification lists Not Modified responses as being exempt from the freshness requirements placed on responses containing a document; this implies that Not Modified responses may be arbitrarily old! (The specification text does say "4. It is an appropriate 304 (Not Modified) ... response message", but it is anyone's guess just what the adjective "appropriate" means here.)
We feel that If-Modified-Since requests should not always be sent all the way up to origin servers for several reasons. First, many popular servers handle millions of requests per day. If If-Modified-Since requests always go all the way to popular servers, those servers must be big machines on high-bandwidth networks, and must be upgraded every time demand increases. Second, on poorly connected networks it is too expensive to communicate with the origin server every time a page must be updated. Third, and most fundamentally, it is arbitrary and wasteful to assign different freshness requirements to If-Modified-Since requests than to ordinary GET requests.
In our opinion, the most appropriate way to handle If-Modified-Since requests is to give them exactly the same freshness requirements as ordinary GET requests. Then, a cached entry which is fresh enough to handle a GET request without contacting a higher-level server can also be used to handle If-Modified-Since requests. Then the argument presented above in the section "Consistency in a cache hierarchy" will apply and when using a maximum age of A, at most one If-Modified-Since request for a given page will pass between two caches in the hierarchy per A time units on average. In fact, the CERN httpd implementation handles If-Modified-Since requests in exactly this way, although its implementation has some mistakes [Din96].
Assuming that the HTTP designers intend If-Modified-Since requests to have the same freshness requirements as ordinary GET requests, they need to say so more explicitly in the HTTP specification, and other minor changes will be needed in the HTTP specification as well. First, in section 13.1.1 "Cache Correctness", Not Modified responses should not be exempt from freshness requirements: they should be subject to rules 1-3 just as ordinary document responses are. Second, in section 10.3.5, Not Modified responses should be allowed to carry Warning headers, as they themselves may be stale.
When If-Modified-Since requests have the same freshness requirements as GET requests, cache implementations can respond to all requests in two stages. In the first stage, an implementation handles expiration, ensuring that its cached copy is sufficiently up to date to handle the request; it can do this without caring whether the request contains the If-Modified-Since header or not. In the second stage, an implementation handles validation, checking the validator in the request against the validator in the cached copy to determine whether the document needs to be sent back or whether a Not Modified response is appropriate. A cache will never consult a higher-level cache in this second stage. The sample Java implementation presented later reflects this two-stage implementation.
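The two stages can be illustrated with the following minimal sketch. It is not the sample implementation referred to above: all names are our own, the Upstream interface is a hypothetical stand-in for the next cache or server, the request's maximum age is passed in already computed, and for brevity stage one always performs a full reload rather than a conditional one.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: two-stage request handling (expiration, then validation).
final class TwoStageCache {
    /** A stored page version; times are seconds since the epoch. */
    static final class Entry {
        final long date;
        final long lastModified;
        final String body;

        Entry(long date, long lastModified, String body) {
            this.date = date;
            this.lastModified = lastModified;
            this.body = body;
        }
    }

    /** Outcome of a request: 200 with a body, or 304 with no body. */
    static final class Response {
        final int status;
        final String body;

        Response(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    /** Hypothetical stand-in for a higher-level cache or the page's server. */
    interface Upstream {
        Entry fetch(String url, long now);
    }

    private final Map<String, Entry> store = new HashMap<>();
    private final Upstream upstream;

    TwoStageCache(Upstream upstream) {
        this.upstream = upstream;
    }

    /**
     * @param maxAge          maximum age computed for this request, in seconds
     * @param ifModifiedSince value of If-Modified-Since, or null if the header is absent
     */
    Response handle(String url, long now, long maxAge, Long ifModifiedSince) {
        // Stage 1 (expiration): make the cached copy new enough for this request.
        // This stage does not look at If-Modified-Since at all.
        Entry entry = store.get(url);
        if (entry == null || now - entry.date >= maxAge) {
            entry = upstream.fetch(url, now);
            store.put(url, entry);
        }
        // Stage 2 (validation): compare validators locally; upstream is never consulted.
        if (ifModifiedSince != null && entry.lastModified <= ifModifiedSince) {
            return new Response(304, null);
        }
        return new Response(200, entry.body);
    }
}
```

Because stage two works only with the copy left in the store by stage one, an If-Modified-Since request for a sufficiently new entry is answered without any traffic to a higher level.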
Suppose that a client fetches pages through a large cache hierarchy. Caches near the client, such as on a local network or a network that serves a single site, will be accessible quickly but will typically have page versions that are relatively old, because those caches serve relatively small communities of users. Caches further from the client, such as national or continental caches, will be accessible more slowly but will have newer page versions. Especially if the client is poorly connected, the client may wish to send Cache-Control: max-stale = (infinity) along with every page request, so that the client will receive stale page versions quickly rather than having to wait for fresh versions. (This behavior is especially useful to users who navigate through hierarchical Web indices such as Yahoo, and want to find an entry they have seen before without waiting for updated versions of each page to load.) In this situation it is especially important for the Web browser to give the user some indication of a page's age or Date so that the user can assess the timeliness of the information they are viewing [Din96]. It would be especially helpful to users if Web browsers provided a "Newer" button which could be used to retrieve a newer, but not necessarily fresh, version of the page currently being viewed; by pressing "Newer" repeatedly, a user could retrieve page versions from the cache hierarchy in succession, ultimately retrieving a completely up-to-date page version from the page's home server itself.
Indeed, the Cache-Control: max-age and Cache-Control: max-stale headers in HTTP 1.1 can be used to implement such a "Newer" button. To accomplish this, a Web browser can send max-stale = (infinity) on the first request so that a page version of any staleness is retrieved as quickly as possible. That page version has some age A; when the user presses "Newer", the browser sends an If-Modified-Since request for the page with max-stale = (infinity) and max-age = (A - 1 second). The response will be either a new document version or a Not Modified message, and will contain a Date value which is newer than that of the previous page version. A Not Modified response indicates that the previous page version was valid for longer than was previously realized, and a browser could react to such a response by updating an on-screen indication of the page's age or Date.
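The request headers a browser might send for the first request and for each press of "Newer" are sketched below. The class and method names are hypothetical, and we assume that the If-Modified-Since value is simply the Last-Modified header of the page version currently displayed, passed through verbatim.

```java
// Sketch: request headers for the initial request and for the "Newer" button.
final class NewerButton {
    /** First request: accept a page version of any staleness (max-stale with no value). */
    static String firstRequestCacheControl() {
        return "Cache-Control: max-stale";
    }

    /**
     * Headers for a "Newer" request.
     *
     * @param displayedAgeSeconds age A of the page version currently displayed
     * @param lastModified        Last-Modified header value of that version, verbatim
     */
    static String[] newerRequestHeaders(long displayedAgeSeconds, String lastModified) {
        // Ask for anything strictly newer than what is displayed: max-age = A - 1 second.
        long maxAge = Math.max(0L, displayedAgeSeconds - 1);
        return new String[] {
            "Cache-Control: max-stale, max-age=" + maxAge,
            "If-Modified-Since: " + lastModified
        };
    }
}
```

For a displayed page that is two hours old, the "Newer" request would carry max-age=7199 together with max-stale, so that any cache holding a newer copy, however stale, may answer.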
In our earlier paper "Web Cache Coherence" [Din96], we suggested that rather than providing a "Newer" button, a browser might automatically retrieve and display a succession of page versions for the user; unfortunately, performing this automatically might be more confusing and distracting than helpful to most users. We can imagine a compromise between the two approaches: after retrieving and displaying the first version of a page, if the page version is stale, a Web browser could automatically send an If-Modified-Since request as described above. If it receives a Not Modified response, the Web browser could update the on-screen page age indicator and, if the Not Modified response was itself stale, repeat the process, sending another If-Modified-Since request; surely this dynamic updating of the page age would not bother most users. If, on the other hand, the Web browser receives a newer page version in response to the If-Modified-Since request, it could indicate to the user in some nondisruptive way that a newer page version is available, perhaps by highlighting the Newer button; this would allow, but not require, the user to view the newer version.
We also suggested in the previous paper that the HTTP protocol might be modified to retrieve a succession of page versions automatically in response to every request. While such a modification would improve the efficiency of retrieving multiple page versions through a large hierarchy, it would complicate the HTTP protocol and might actually be less efficient when users don't need new page versions. For these reasons, we now feel that the HTTP max-age and max-stale headers are probably a better mechanism for retrieving stale page versions.
In this section we discuss a number of implementation issues which HTTP cache implementors must address.
In order to fulfill HTTP 1.1's page age requirements, a cache must update a cached page if necessary when the cache receives a GET request for the page. In fact, caches may wish to update cached pages more often. For example, a cache might choose to update all of its cached pages at night when network usage is low, so that fresh versions of those pages will be available the next day. The HTTP 1.1 specification should say more explicitly that caches may update cached pages at times other than when GET requests are received for those pages.
We feel that it is especially important for caches to update pages in one situation where it is not required by the specification: whenever a cache returns a stale page in response to a request containing max-stale, the cache should update the stale page version after returning it to the user (a good implementation could even return the stale version and update it to retrieve a newer version in parallel); we callled this behavior "post-fetching" in our previous paper [Din96]. If caches do not do so, poorly connected users who send max-stale = (infinity) with every request will see the same stale page over and over again if they visit it repeatedly. If users retrieve successive page versions either manually or automatically as suggested above, caches which update stale pages after they have been requested effectively prefetch information which the user will want very soon anyway. Given these advantages, we feel that the HTTP 1.1 specification should recommend that cache implementations update stale pages in this way. Please note that our Java implementation below does not perform this update (and should, and will in a forthcoming version of this paper).
There are several situations in which a cache may return a response which does not satisfy the maximum age requirements expressed by the client or server. HTTP 1.1 specifies a set of warnings to be issued in such situations, but does not define clearly enough the circumstances under which each warning may be issued, and is not clear enough about how caches are to deal with responses from higher-level caches which don't meet the requested requirements.
Section 13.1.1 "Cache Correctness" of the HTTP 1.1 specification says that a response must include a warning "if the freshness demand of the client or the origin server is violated". HTTP 1.1 defines a warning "10 Response is stale" to be issued when the server's maximum age requirement is violated, even if the server's age requirement is explicitly relaxed by the client. Unfortunately, the specification defines no warning to be issued when the client's freshness demand is violated! Such a warning would be necessary if, for example, a client specified a max-age value of 1 hour but a cache returned a page which was 2 hours old, perhaps because it was unable to communicate with a higher-level server. In such a situation, the response might not be stale at all, and so a warning 10 would not be appropriate.
We might consider adding a new warning "Response is older than requested" to HTTP; this warning would be issued whenever the client's freshness demand is violated. Then, for example, a client who received a page which was too old, but not stale, would receive "Response is older than requested" with the HTTP response. A client who expressed willingness to receive stale pages by sending max-stale = (infinity), and who received a stale page in response to the request, would receive "Response is stale" but not "Response is older than requested". And a client who was not willing to receive stale pages but received one anyway would receive both "Response is stale" and "Response is older than requested" with the response.
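The three cases just described can be captured in a few lines. The following is a minimal sketch, assuming that ages and lifetimes are given in seconds and that -1 means "header absent" (the convention of our Java classes below); the class WarningSketch and its method are hypothetical, and min-fresh is ignored for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: which warnings accompany a response under the proposal above.
class WarningSketch {
    static final String STALE = "Response is stale";
    static final String OLDER = "Response is older than requested"; // proposed, not in HTTP 1.1

    // age and lifetime are in seconds; maxAge and maxStale use -1 for "header absent".
    // A maxStale of Integer.MAX_VALUE stands for max-stale = (infinity).
    static List<String> warnings(int age, int lifetime, int maxAge, int maxStale) {
        List<String> w = new ArrayList<>();
        int staleness = age - lifetime;  // positive once the server's limit is exceeded
        int allowedStaleness = (maxStale == -1) ? 0 : maxStale;
        if (staleness > 0)
            w.add(STALE);                // the server's freshness demand is violated
        if (staleness > allowedStaleness || (maxAge != -1 && age > maxAge))
            w.add(OLDER);                // the client's freshness demand is violated
        return w;
    }
}
```

For instance, a page that is 2 hours old with a one-day lifetime, requested with max-age of 1 hour, yields only the proposed warning, while a stale page requested with no max-stale at all yields both.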
It is not clear, however, how useful such a warning would be. If network delays are significant, then a response might become too old en route from the server to the client, even though it was not too old at the time the server began to transmit it. A client who wanted to know whether the requested freshness demand was satisfied, then, could not simply check for the warning message, but would have to explicitly test the age of the response. Furthermore, this new warning could not be cached, unlike other warnings, because its meaning depends on the parameters of a specific request. As an alternative to this new warning, then, we might simply change section 13.1.1 "Cache Correctness" to say that a response must include a warning if the server's freshness demand is violated, but not if the client's demand is violated.
We should also consider changing the conditions under which warning 11 "Revalidation failed" is issued. The specification says that this warning is returned "if a cache returns a stale response because an attempt to revalidate the response failed, due to an inability to reach the server". We feel that servers should probably return this warning when revalidation fails even if the page is older than requested by the client, but not stale, so that clients will know why their demands were not met. We should be aware, however, that returning this warning even for non-stale pages may affect its cacheability. Suppose that client A requests a page through a proxy cache C; A sends "max-age = 1 hour". C receives a non-stale page from a higher-level proxy which is 3 hours old and includes "Revalidation failed", and returns the page to A. Client B may then request the same page through C, but sending "max-age = 5 hours". C will not revalidate the page, because the page satisfies B's freshness requirement. But if C has cached the warning and returns it to B, then B will receive the warning even though its freshness demand was satisfied. The simplest solution to this problem is to make the warning non-cacheable. There is no danger in doing so, because if a cache receives a response including "Revalidation failed" and caches it without the warning, the cache must attempt to revalidate the response anyway the next time the page is requested by a client who would not be satisfied with the version in the cache.
HTTP makes specific provisions for the case in which a response becomes too old in transit. In section 13.1.1 "Cache Correctness", the specification says "If a cache receives a response that it would normally forward to the requesting client, and the received response is no longer fresh, the cache SHOULD forward it to the requesting client without adding a new Warning". We believe that this provision should be extended to the case in which the response is still fresh, but does not satisfy the client's maximum age requirements.
We must also consider the response to an If-Modified-Since message when the client's freshness requirements cannot be met due to disconnected operation or a network failure. Suppose that a cache holds a cached copy of a page P; this copy has a Last-Modified date of L and has a Date header whose value is D. Further suppose that a client sends the cache a message with If-Modified-Since: S, that the cached copy is not new enough to satisfy the client's freshness requirements, and that the cache is unable to communicate with its parent cache; then three possibilities may ensue. If L > S, the cache should certainly return the document version it has, which is newer than that held by the client. If L = S, the cache should certainly return a Not Modified message with Date: D; although this message doesn't satisfy the client's requirements, it may give the client newer information about the page than the client already has. The case L < S, in which the client has newer information than the cache, is more problematic: this may occur if the client bypassed the cache hierarchy on a previous request for this page. The HTTP specification is vague about this case: a cache could return a 504 Gateway Timeout message, or could return a Not Modified Message with Date: D, even though certainly D < S, along with a warning if appropriate. We recommend the latter possibility, which avoids complicating the protocol, but the specification should certainly be more explicit about this.
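The three cases can be written down directly. This is a minimal sketch under stated assumptions: times are plain numbers (seconds since some epoch), the class DisconnectedConditionalSketch and its enum are hypothetical, and the case L < S follows our recommendation above rather than anything the specification requires.

```java
// Hypothetical sketch of the three cases above; times are seconds since some epoch.
class DisconnectedConditionalSketch {
    enum Action { RETURN_FULL_DOCUMENT, RETURN_NOT_MODIFIED }

    // lastModified is L (the cached copy's Last-Modified) and ifModifiedSince is S.
    // The cache is assumed to be unable to reach its parent.
    static Action respond(long lastModified, long ifModifiedSince) {
        if (lastModified > ifModifiedSince) {
            // L > S: the cache holds a newer version than the client does.
            return Action.RETURN_FULL_DOCUMENT;
        }
        // L = S: Not Modified with Date: D may still tell the client something new.
        // L < S: the specification is vague; following the recommendation above, we
        // also answer Not Modified with Date: D rather than 504 Gateway Timeout.
        return Action.RETURN_NOT_MODIFIED;
    }
}
```

In both Not Modified cases the response would carry the cached copy's Date: D, along with a warning if appropriate.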
Serious problems can arise when different caches in a hierarchy use different heuristics for expiration. To see this, consider a hierarchy in which a parent cache P sets maximum ages heuristically as 50% of the time between a page's Last-Modified and Date values, and in which a child cache C uses an expiration factor of 20% instead of 50%. In this situation, the heuristic expiration time assigned by cache P for each page will be later than the time assigned to the page by cache C. Suppose that a page has expired according to C's expiration heuristic but not according to P's heuristic. When a client of C requests the page, C will send an If-Modified-Since message to P to update the page; P will return a Not Modified message with the same Date as C's cached copy of the page, communicating no new information to C. The situation is somewhat problematic: should C attach a warning to the page and return it to the client? Effectively P has prevented C from maintaining the expiration heuristic chosen by C's administrator.
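A small worked example makes the mismatch concrete. The numbers are assumptions chosen for illustration: the page was last modified at hour 0 and both caches hold a copy with Date at hour 100, so P (factor 50%) expires it at hour 150 while C (factor 20%) expires it at hour 120. The class HeuristicMismatchSketch is hypothetical.

```java
// Hypothetical worked example; all times are in hours since the page was last modified.
class HeuristicMismatchSketch {
    // Heuristic expiration time: Date + factor * (Date - Last-Modified).
    static double expiresAt(double lastModified, double date, double factor) {
        return date + factor * (date - lastModified);
    }

    // True if a copy with the given Last-Modified and Date is stale at time now.
    static boolean isStale(double lastModified, double date, double factor, double now) {
        return now > expiresAt(lastModified, date, factor);
    }

    public static void main(String[] args) {
        double now = 130; // between the two expiration times
        System.out.println("stale for C: " + isStale(0, 100, 0.2, now));
        System.out.println("stale for P: " + isStale(0, 100, 0.5, now));
    }
}
```

At hour 130 the copy is stale by C's heuristic but fresh by P's, so C's revalidation request is answered from P's cache with an unchanged Date.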
To work around this problem, a cache which assigns a heuristic expiration for a page can send a Cache-Control: max-age header with each request to update the page rather than relying on the parent cache's expiration heuristic. A cache which does so must compute the maximum page age which will result in a page which is fresh by its own expiration heuristic. Suppose that L is a page's Last-Modified date, D is the value of its Date: header, T is the current time and E is the heuristic expiration factor used by the cache. Then the page expires at time
D + (D - L) E
The page is stale if
T > D + (D - L) E
If it is stale, then its staleness S is
S = T - (D + (D - L) E)
Solving for D, we have
D = (T + LE - S) / (1 + E)
This means that if we update the page and receive a Not Modified message whose Date is D0, and we want the updated page's staleness to be less than or equal to some value S0, then we must have
D0 >= (T + LE - S0) / (1 + E)
Let A0 be the age of the updated page; then A0 = T - D0 and we have
A0 <= V
where
V = (S0 - E(L - T)) / (1 + E)
The cache can specify Cache-Control: max-age = V when updating the cached page, where S0 is set according to the value of Cache-Control: max-stale or Cache-Control: min-fresh in the client's request to the cache, or where S0 = 0 if the client specified neither max-stale nor min-fresh. The client may have specified its own max-age requirement which must be used instead if it is less than V. Then when the cache sends If-Modified-Since to update the page, one of two possibilities may ensue. If the response is Not Modified, the updated page will be fresh enough according to the cache's own expiration heuristic. If the response contains a new document version, it may not actually be fresh enough, because it will have a new Last-Modified date, not the one used in computing V. In this second case, a cache could attempt to update the page once again, but to do so would be risky because it is possible that the page actually became too old in transit: an infinite loop might occur, as would definitely happen if the server explicitly set the page's expiration date to be earlier than its Date (as permitted in section 13.2.1 of the HTTP 1.1 specification). So in this second case it is probably better to simply return the page even though it does not satisfy the client's demands, or to revalidate the page only if it is at least, say, 1 hour old.
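The formula for V can be checked numerically. The sketch below is illustrative only: the class MaxAgeSketch and the sample values (a page last modified 120 hours ago, E = 0.2, S0 = 0) are assumptions. It computes V and then confirms that a page whose Date is exactly T - V has staleness S0 under the cache's own heuristic.

```java
// Hypothetical sketch of the max-age calculation; all times are in seconds.
class MaxAgeSketch {
    // V = (S0 - E(L - T)) / (1 + E), the largest acceptable age of the updated page.
    static double maxAge(double s0, double lastModified, double now, double e) {
        return (s0 - e * (lastModified - now)) / (1 + e);
    }

    // Staleness at time now of a page with the given Date under heuristic factor e:
    // S = T - (D + (D - L) E). A negative value means the page is still fresh.
    static double staleness(double lastModified, double date, double now, double e) {
        return now - (date + (date - lastModified) * e);
    }

    public static void main(String[] args) {
        double lastModified = 0, now = 120 * 3600.0, e = 0.2, s0 = 0;
        double v = maxAge(s0, lastModified, now, e); // about 20 hours, in seconds
        double date = now - v;                       // oldest acceptable Date
        System.out.println("max-age = " + v + " s, resulting staleness = "
                + staleness(lastModified, date, now, e) + " s");
    }
}
```

Any Not Modified response with a later Date (a smaller age) gives a staleness below S0, which is why V serves as an upper bound for max-age.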
Of course, if servers always provide expiration dates for pages then we may avoid all of this complexity. But since most pages don't come with expiration dates today and since different caches will likely use differing expiration heuristics, we feel that the max-age calculation we have just described will be worth implementing, and deserves to be included in the HTTP specification. We included this calculation in our Java proxy implementation (described later).
In section 13.2.6 "Disambiguating Multiple Responses", the HTTP specification gives explicit instructions to a cache which updates a cache entry and receives a new page whose Date header is older than the one for the existing entry. At first it's not clear how this could ever occur. It's certainly true that a parent cache P could contain older information than a child cache C: this could happen, for example, if the cache hierarchy topology were reconfigured or if a child cache temporarily bypassed its parent. But presumably a child cache C will only revalidate a cached page if its own copy of that page is not new enough to satisfy client demands; how is it possible that C could receive an older page when it passes the same client demands up to a parent cache?
There are at least three situations in which a client might receive older information while updating a cache entry. First, if the client and server have different notions of heuristic expiration then the server may return an older page which it believes to be fresh, even though the client believed its own, newer page to be stale. If the client explicitly computes a max-age according to its own expiration heuristic as described above, however, this will not happen. Second, the server may be unable to satisfy client demands due to disconnected operation or network failure. Third, a server administrator might modify a page to have an expiration date earlier than it had at some point in the past. To see this, suppose that a parent cache P retrieves a page from its origin server on September 2; P caches the page with headers Last-Modified: September 1, Date: September 2 and Expires: October 1. Suppose then that a child cache retrieves the same page directly from its origin server on September 15, caching the page with headers Last-Modified: September 1, Date: September 15 and Expires: September 20; the page's expiration date has moved backwards. Now if a client retrieves the page through the child cache on September 21, the child cache will attempt to update the page through the parent cache; the parent thinks its copy is fresh and will return a Not Modified message with Date: September 2, which is earlier than the Date in the child cache's copy.
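The third situation can be reproduced with a few dates. In the sketch below, the class ExpirationMovedBackSketch and its nested Copy class are hypothetical, only the Date and Expires headers are modeled, and the year 1996 is an arbitrary assumption needed to construct calendar dates.

```java
import java.time.LocalDate;

// Hypothetical model of the September example; only Date and Expires matter here.
class ExpirationMovedBackSketch {
    static class Copy {
        final LocalDate date, expires;

        Copy(LocalDate date, LocalDate expires) {
            this.date = date;
            this.expires = expires;
        }

        boolean isStale(LocalDate today) {
            return today.isAfter(expires);
        }
    }

    // True if revalidating the child's copy through the parent yields older information:
    // the child is stale, the parent considers itself fresh, and the parent's Date is earlier.
    static boolean parentReturnsOlder(Copy parent, Copy child, LocalDate today) {
        return child.isStale(today) && !parent.isStale(today)
                && parent.date.isBefore(child.date);
    }

    public static void main(String[] args) {
        Copy parent = new Copy(LocalDate.of(1996, 9, 2), LocalDate.of(1996, 10, 1));
        Copy child = new Copy(LocalDate.of(1996, 9, 15), LocalDate.of(1996, 9, 20));
        System.out.println(parentReturnsOlder(parent, child, LocalDate.of(1996, 9, 21)));
    }
}
```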
The HTTP 1.1 specification recommends that if a cache updates a cache entry and receives older information than it already has, the cache should repeat the update request and include Cache-Control: max-age = 0 to force the cache hierarchy to bring itself completely up to date with respect to the page in question. Normally, a request with max-age = 0 should be considered an expensive operation, because it always travels all the way to the origin server. But the first and third situations listed above are sufficiently uncommon that this resolution is reasonable in those cases. In the second case, an update operation might return a Not Modified response whose Date is earlier than the one it already has, along with a "Revalidation failed" or "Disconnected operation" warning; we mentioned this possibility in the section "What to do when a request can't be satisfied", above. In this case, the client should certainly not repeat the update request, since it will certainly fail again; section 13.2.6 of the specification should address this case directly.
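The resulting decision might be sketched as follows. The class OlderResponseSketch and its parameter names are hypothetical, and dates are plain numbers (seconds since some epoch); the two boolean flags stand for the presence of the "Revalidation failed" and "Disconnected operation" warnings in the response.

```java
// Hypothetical sketch of the decision described above; dates are seconds since some epoch.
class OlderResponseSketch {
    // Returns true if the cache should repeat the update with Cache-Control: max-age = 0.
    static boolean shouldRetryWithMaxAgeZero(long responseDate, long cachedDate,
                                             boolean revalidationFailed,
                                             boolean disconnectedOperation) {
        if (responseDate >= cachedDate)
            return false; // the response is not older than what we already hold
        if (revalidationFailed || disconnectedOperation)
            return false; // the hierarchy could not reach the server; a retry would fail too
        return true;      // first and third situations: force the hierarchy up to date (13.2.6)
    }
}
```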
The caching sections in the HTTP specification contain enough loose ends that in studying the text one often wishes one could just "look at the source code". We have implemented a simple HTTP 1.1 caching proxy in Java to be sure that there are no holes in our own understanding of the specification, and to serve as a reference for cache implementors. The implementation is not usable as a real proxy because it performs no parsing or text output: it expects HTTP requests as Java objects containing already-parsed data, and returns HTTP responses as Java objects as well. The implementation also keeps the entire cache in memory, has no interface to network sockets, and omits many important HTTP headers. By simplifying the implementation in this way, we were able to fully specify how caching should work in several hundred lines of code.
The most important class in the implementation is the Proxy class, which implements a caching proxy. The top-level method in Proxy is get(), which is called when the cache receives an HTTP GET request. get() takes a Request object as a parameter, representing an HTTP request with all its headers, and returns a Response object.
We wrote several simple simulation classes which we used to test the proxy code. We haven't included their source code here, but will be happy to share them upon request.
The Java implementation still does not implement several features we have suggested here, such as updating stale pages after they are requested. It also does not deal at all with network failures. We plan to address these issues in a revised implementation soon.
import java.util.*;
class Request {
    // Below, default initialization values are the values which fields take
    // if the corresponding HTTP header is not present.
    // method is assumed to be GET
    String uri;
    Date if_modified_since = null;
    Date if_unmodified_since = null;
    ETag if_match = null;
    ETag if_none_match = null;
    int max_age = -1; // Cache-Control:max-age
    int max_stale = -1; // Cache-Control:max-stale
    // If max-stale is present but no value is specified, this field
    // has value Integer.MAX_VALUE .
    int min_fresh = -1; // Cache-Control:min-fresh
    public Request(String uri) { this.uri = uri; }
}
import java.util.*;
class Response {
    // Below, default initialization values are the values which fields take
    // if the corresponding HTTP header is not present.
    int status; // response status code
    Date date; // must always be present
    ETag etag = null;
    Date last_modified = null;
    Date expires = null;
    int max_age = -1; // Cache-Control:max-age
    boolean warningStale = false; // true if response includes warning 10: Response is stale
    boolean warningRevalidation = false; // true if response includes warning 11: Revalidation failed
    // no Age: header - assume that clocks are synchronized
    String body = null; // the message body
    public Response(int status) {
        this.status = status;
    }
}
class ETag {
    String tag;
    boolean weak;
    boolean strongCompare(ETag e) {
        return (!weak && !e.weak && tag.equals(e.tag));
    }
    boolean weakCompare(ETag e) {
        return tag.equals(e.tag);
    }
}
interface Server {
    Response get(Request r);
}
import java.util.*;
class Proxy implements Server {
    // A (model of a) HTTP 1.1 caching proxy.
    private Server server; // next higher-level server
    private Hashtable cache; // the cache, keyed by URI
    private double expFactor; // factor for heuristic expiration
    public Proxy(Server s, double expFactor) {
        // Create a new Proxy which sends requests to the given higher-level server
        server = s;
        cache = new Hashtable();
        this.expFactor = expFactor;
    }
    protected int freshnessLifetime(Response r) {
        // Compute a response's freshness lifetime (13.2.4)
        if (r.max_age != -1)
            return r.max_age;
        if (r.expires != null)
            return dateSubtract(r.expires, r.date);
        // Compute the freshness lifetime heuristically
        if (r.last_modified != null) {
            int s = dateSubtract(r.date, r.last_modified);
            return (int) (expFactor * s);
        }
        return Day;
    }
    private int stalenessAge(int stale, int sinceModified) {
        // Given the number of seconds which have elapsed since a page was last modified,
        // return a number of seconds A such that if the page is updated to have age A,
        // its staleness will be "stale" seconds. "stale" might be negative, indicating the
        // number of seconds which will elapse before the page becomes stale.
        // Normally, the returned A will fall between 0 and sinceModified.
        // If "stale" is negative and large in magnitude, A may be less than 0, indicating
        // that even if the page is revalidated to be completely up to date, it still will not satisfy
        // the freshness demand.
        // If "stale" is positive and large in magnitude, A may be greater than sinceModified,
        // indicating that any Date: value for the page falls within the allowed staleness.
        // This is V = (S0 - E(L - T)) / (1 + E) from the paper, with sinceModified = T - L.
        return (int) ((stale + sinceModified * expFactor) / (1 + expFactor));
    }
    protected int age(Response r) {
        // Compute a response's age (13.2.3)
        // To keep the model simple, we assume that all clocks are synchronized
        return dateSubtract(currentTime(), r.date);
    }
    protected boolean isStale(Response r) {
        return (age(r) > freshnessLifetime(r));
    }
    protected int staleness(Response r) {
        // the number of seconds that have elapsed since the given response expired
        return (age(r) - freshnessLifetime(r));
    }
    protected int freshness(Response r) {
        // the number of seconds after which the given response will expire
        return (freshnessLifetime(r) - age(r));
    }
    protected boolean freshEnough(Response resp, Request req) {
        // Returns true if the given response is fresh enough to meet the demands of
        // the given request
        if (req.max_stale == -1 && isStale(resp)) // stale, and the requestor doesn't allow it
            return false;
        if (req.max_stale != -1 && staleness(resp) > req.max_stale) // too stale
            return false;
        if (req.max_age != -1 && age(resp) > req.max_age) // too old
            return false;
        if (req.min_fresh != -1 && freshness(resp) < req.min_fresh) // not fresh enough
            return false;
        return true;
    }
    public Response revalidate(Response cached, Request req) {
        // Revalidate a cache entry which is not fresh enough to satisfy a given request.
        // Returns the newly validated entry.
        Request newreq = new Request(req.uri);
        if (cached.expires != null || cached.max_age != -1) {
            // The page has an expiration date, so the parent cache's notion of expiration coincides
            // with our own. Thus, it is meaningful to pass max_stale and min_fresh along.
            newreq.max_age = req.max_age;
            newreq.max_stale = req.max_stale;
            newreq.min_fresh = req.min_fresh;
        }
        else {
            // Heuristic expiration. The parent cache's notion of expiration may not coincide with our
            // own, so we'll be more explicit about what we want.
            // Note: this branch assumes that cached.last_modified is non-null.
            int sinceMod = dateSubtract(currentTime(), cached.last_modified);
            int maxage;
            if (req.max_stale == -1)
                maxage = stalenessAge(0, sinceMod);
            else if (req.max_stale == Integer.MAX_VALUE) // any staleness is OK
                maxage = Integer.MAX_VALUE; // any age is OK
            else // specified acceptable degree of staleness
                maxage = stalenessAge(req.max_stale, sinceMod);
            if (req.min_fresh != -1)
                maxage = Math.min(maxage, stalenessAge(- req.min_fresh, sinceMod));
            if (req.max_age != -1)
                maxage = Math.min(maxage, req.max_age);
            newreq.max_age = maxage;
            newreq.max_stale = Integer.MAX_VALUE;
            // important so that we can get stale pages from higher-level caches if we want to
            newreq.min_fresh = -1;
        }
        // Note that we do NOT want to pass the if_(un)modified_since and if_(none_)match
        // headers on in our revalidation request, since we may hold a different version
        // of the page than the requestor does. Instead, we validate against our own cached copy.
        newreq.if_modified_since = cached.last_modified;
        newreq.if_none_match = cached.etag;
        Response resp = server.get(newreq);
        if (resp.date.before(cached.date)) {
            // We received information that is older than our own. This should almost never happen -
            // see the paper for an explanation. If it does, we follow the recommendation in 13.2.6 and
            // force the cache hierarchy to refresh itself.
            newreq.max_age = 0;
            resp = server.get(newreq);
        }
        if (resp.status == 200) { // OK
            cache.put(req.uri, resp); // completely replace old response
            return resp;
        }
        else if (resp.status == 304) { // Not Modified
            // update cached response
            cached.date = resp.date;
            cached.etag = resp.etag;
            cached.expires = resp.expires;
            cached.max_age = resp.max_age;
            // response should not contain other header fields
            return cached;
        }
        // ++ should deal with other response codes here
        return cached;
    }
    public Response get(Request req) {
        Response cached = (Response) cache.get(req.uri);
        if (cached == null) {
            // There was nothing in the cache. Just pass the request on, caching the result as it comes back.
            Response resp = server.get(req);
            cache.put(req.uri, resp);
            return resp;
        }
        if (!freshEnough(cached, req))
            cached = revalidate(cached, req);
        // We might return this entry to the requestor, so set a staleness warning as appropriate.
        // Note that the response presently in the cache, if just received from a higher-level server,
        // might actually not satisfy the client's freshness demands, due to time spent in transit.
        // As suggested in 13.1.1 "Cache Correctness", we neither revalidate nor attach warnings to such responses.
        // Thus, if the user is asking for a non-stale response, we don't attach a warning to it.
        cached.warningStale = req.max_stale != -1 && isStale(cached);
        // First deal with If-Unmodified-Since and If-Match. These will not normally be used
        // for retrieving documents, and are not very important in this caching model.
        if (req.if_unmodified_since != null &&
                (cached.last_modified == null || cached.last_modified.after(req.if_unmodified_since)))
            return errorResponse(412); // Precondition Failed
        if (req.if_match != null &&
                (cached.etag == null || !cached.etag.strongCompare(req.if_match)))
            return errorResponse(412); // Precondition Failed
        // Now deal with If-Modified-Since and If-None-Match, which will normally be used in
        // retrieving documents from caches.
        if (req.if_modified_since == null && req.if_none_match == null)
            return cached;
        if (req.if_modified_since != null &&
                (cached.last_modified == null || cached.last_modified.after(req.if_modified_since)))
            // It has been modified, so return the whole document.
            // This may happen even if the etags match, if they are weak.
            return cached;
        if (req.if_none_match != null &&
                (cached.etag == null || !cached.etag.weakCompare(req.if_none_match)))
            // The etags don't match, so return the whole document.
            // It would be surprising if this happened even though If-Modified-Since was specified
            // and the document had apparently not been modified.
            return cached;
        // Now construct and return a 304 Not Modified response.
        Response r = new Response(304);
        r.date = cached.date;
        r.etag = cached.etag;
        r.expires = cached.expires;
        r.max_age = cached.max_age;
        // We could conceivably attach a staleness warning here sometimes, but the HTTP spec forbids it (10.3.5).
        return r;
    }
    // utility routines and values
    protected Response errorResponse(int status) {
        Response r = new Response(status);
        r.date = new Date();
        return r;
    }
    public Date currentTime() { return new Date(); }
    public int dateSubtract(Date d1, Date d2) {
        // Return the difference (d1 - d2) between the two given Date objects, in seconds
        return (int) ((d1.getTime() - d2.getTime()) / 1000);
    }
    public static final int Hour = 3600;
    public static final int Day = 24 * Hour;
}
We have extensively examined the caching mechanisms specified in HTTP 1.1, and we found plenty to talk about even without considering ETag validators and the interaction of caching and content negotiation and ranges. We found numerous areas in which the specification is vague and suggested how to clarify it. Despite our feeling that the specification is not yet watertight, HTTP 1.1's caching is certainly a significant advance over HTTP 1.0 and on the whole, its basic mechanism seems sound: most of our criticisms are related to weaknesses in the specification, not in the design itself. We look forward eagerly to the widespread implementation of HTTP 1.1 caching and to the benefits it will provide.