Cache Awareness

Caching benefits both the end user, who must wait for objects on the Web, and the entire user community, which must contend with busy servers and congested networks. That congestion has worsened as the population of the Internet community has grown faster than the infrastructure can support. To be sure, the infrastructure will continue to improve, but I believe that content will continue to increase at a pace equal to or greater than the improvements in the infrastructure.

I am reminded of a university professor in 1980 who challenged software developers to keep up with the increases he saw coming in processor power and memory density. I think that software developers who deliver products on multiple CDs that take too long to load and run have more than met that challenge. In the same light, I believe that increased use of audio, video, and improved graphic resolution, combined with growing demand, will continue to outstrip improvements in the network infrastructure. Given the current responsiveness of the Web and the assumption that volume and demand will continue to lead bandwidth, caching becomes even more critical with time.

Caching is an on-demand, automated way of getting objects stored closer to the users who want them. Instead of a user in Tokyo having to drag an object across the ocean from California, and his next-door neighbor having to do the same thing, a cache allows the object to be brought over once. The second request can be satisfied by a cached copy of the first response.

Beyond the benefit of caching itself, a number of simple steps can improve the effectiveness of caches and increase the number of cacheable objects. This paper outlines those steps.

Increase the use of shared caches.
Shared caches, at the workgroup or larger organizational level, can be much more efficient than a private cache. A shared cache lets users take advantage of objects that their neighbors have already accessed, and it uses the disk space allocated to caching more effectively, because one stored copy serves many users.

Use Last Modified Date headers always and Expires headers whenever possible.
These are the dates that caches use to determine how long an object can be cached. The Expires header provides explicit control over how long an object can exist in a cache before becoming stale. Unfortunately, the Expires header is not often used in current implementations of the protocol. The Last Modified Date is much more common and is routinely used as a heuristic for determining staleness when the Expires header is not provided. In that case, the object's lifetime is usually taken to be a percentage of the difference between the current time and the object’s Last Modified Date.

Maintain correct clock times on workstations and servers.
Time is very important in caching, since an object may be kept in a cache only for a specified length of time before it goes stale. If clocks are not maintained correctly and consistently, stale objects can be served, or valid objects can be thrown away before they are actually stale.

Share links to common graphics.
At one point, one survey showed that the most common object on the Web was the “Netscape Now” logo bitmap. It was referenced by a large number of sites, as everyone wanted to be Netscape compatible. Many sites made a copy of this bitmap on their own server and had their objects refer to the local copy. Since each local copy has a unique name from the standpoint of a shared cache, each one is retrieved from its own server and cached separately, where only one copy really needs to be cached. Linking to a single common copy avoids this duplication.

Structure web objects with caching in mind.
A common example where this is not done is with objects that contain visitor counters. Instead of embedding the counter in an otherwise static object and making the whole thing uncacheable, the object should contain a link to the counter. That way, when the object is loaded it can load the counter, but the object itself can be cached while the counter is not. Of course, a more cache-friendly approach would be to not use a counter to begin with.

Avoid browser-specific web pages.
Try to make web pages independent of the browser. This avoids having what are essentially two copies of each page. There are features in Java and HTML that can be used to support browser intricacies and still allow the document to be cacheable in a shared cache.

In general, these are some of the things that can be done to make objects more cache friendly. These small things can have a big impact on end-user response time and on the overall effectiveness and efficient utilization of the Internet.
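
As an illustration of the Last Modified Date heuristic described earlier, the following is a minimal sketch of how a cache might decide how long an object stays fresh. It is not taken from any particular cache implementation. The function name freshness_lifetime, its parameters, and the 10 percent default are assumptions made for this example; the text above says only that "a percentage" is used.

```python
from datetime import datetime, timedelta, timezone

# Assumed value for illustration only; real caches choose their own percentage.
HEURISTIC_FRACTION = 0.10


def freshness_lifetime(now, last_modified=None, expires=None,
                       fraction=HEURISTIC_FRACTION):
    """Return how much longer (from `now`) a cached object may be served.

    All arguments are timezone-aware datetimes taken from the cache's clock
    and from the response's Expires and Last Modified Date headers.
    """
    # An explicit Expires date always wins over the heuristic.
    if expires is not None:
        return max(expires - now, timedelta(0))

    # Otherwise use a fraction of the time since the object last changed.
    if last_modified is not None:
        # A Last Modified Date in the future suggests clock skew; treat the
        # object as just modified, which gives it no heuristic lifetime.
        time_since_change = max(now - last_modified, timedelta(0))
        return time_since_change * fraction

    # With neither header, this sketch does not consider the object fresh.
    return timedelta(0)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    modified = now - timedelta(days=30)
    print("Heuristic lifetime:", freshness_lifetime(now, last_modified=modified))
    print("Explicit lifetime:",
          freshness_lifetime(now, expires=now + timedelta(hours=6)))
```

The sketch also shows why correct clocks matter: both branches compare header dates against the cache's own notion of the current time, so a skewed clock on either side lengthens or shortens the computed lifetime.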