Maciej Kozinski
Nicolaus Copernicus University of Torun
Overview of W3cache in Torun - current state and future development
Caching in Torun is shaped by local circumstances: Torun's place in the network topology and the number of users. Torun's outside link still has a bandwidth of only 256 kbps, while the local Internet community exceeds 5000 users. If we want good Internet access, we have to use caching.
Historic overview
The history of caching in Torun starts in early 1994, when an Alex server was set up at Nicolaus Copernicus University (the main participant in the development of the Metropolitan Area Network). The server caches FTP requests and presents the responses as an NFS filesystem, which gives additional functionality to the user. The final installation has been running since early 1995 and is still active and useful to users.
Caching was also tried with a number of independent CERN httpd installations, which also acted as proxy caching servers. They were never used on a large scale because of their limited functionality in hierarchical caching and their experimental character.
All these attempts were made in anticipation of future traffic. I think this was a good approach, because we benefit from it today.
At the end of 1995 and the beginning of 1996, two of our servers running Harvest Cached 1.4 were connected to the Polish w3cache hierarchy, where they belong to level 1. This was the beginning of a new approach to large-scale traffic optimization in Torun.
Current state
Currently we operate two w3cache servers in Torun. They serve Torun's MAN (TORMAN) and, for now, the surrounding region as well, because there are no similar installations in Bydgoszcz or Wloclawek. Both servers are based on Sun's SPARC platform (SparcServer 1000E and SparcStation 20MP), running Solaris 2.5 and SunOS 4.1. The cache software on both installations is Harvest Cached 1.4. The combined disk space allotted to cache files is 3.2 GB. Both servers are connected to the MAN backbone through a direct ATM link (155 Mbps). They act as neighbors to each other and also have the w3cache server in Gdansk as a neighbor. There are two parents, both at ICM, Warsaw. The number of clients is hard to determine, but approximately 200-250 workstations from four LANs, as well as single workstations from other LANs, are constantly connected to one server (alex.man.torun.pl). Two servers belonging to commercial providers are also connected to our installations.
Our experience
During more than two years of work with caching software and nearly a year of participation in the w3cache hierarchy, we have gained experience in tuning servers and advertising the service. In our opinion these two tasks are equally important for the success of w3cache. The specific profile of our environment, especially the number of people responsible for network installations, made the w3cache servers very popular and useful. This can be confirmed by the statistics of one server, which serves at least four full LANs with 200-250 workstations. The amount of data it served through the cache in June 1996 was larger than that of the sole servers in big towns with a relatively good (2 Mbps) link to the world. This shows that advertising and popularity are very important for the success of the w3cache service. With a lot of clients we can achieve impressive savings even with an average hit ratio, as our statistics show.
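The trade-off between client volume and hit ratio can be illustrated with a small calculation. This is only a sketch: the function name and the figures used in the example are hypothetical and are not taken from our statistics.

```python
def traffic_saved_mb(served_mb: int, hit_percent: int) -> float:
    """Megabytes kept off the outside link.

    served_mb   -- data delivered to clients through the cache
    hit_percent -- share of that data answered from the cache (0-100)
    """
    if not 0 <= hit_percent <= 100:
        raise ValueError("hit_percent must be between 0 and 100")
    return served_mb * hit_percent / 100


# Illustrative figures only (assumed, not measured):
# a popular cache with an average hit ratio ...
busy = traffic_saved_mb(served_mb=10000, hit_percent=30)
# ... compared with a little-used cache with a better hit ratio.
quiet = traffic_saved_mb(served_mb=2000, hit_percent=50)
print(busy, quiet)
```

With these assumed numbers the popular cache saves three times as much traffic as the quiet one, although its hit ratio is lower.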
Fine-tuning the cache is the other part of the system administrator's job. Some local requirements are very specific. For example, our University Library purchased a license for a physical electronics magazine delivered over the Web, and the license permits only two caching servers to access it as clients. We had to exclude the rest of the cache hierarchy from caching these documents. Some tuning problems are presented below.
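The licensing restriction amounts to a routing decision made per request: documents from the licensed server are fetched directly from the origin, everything else may go through neighbors and parents. The sketch below shows the idea in Python. It is not the Harvest configuration we use; the host name journal.example.org and the names DIRECT_ONLY_HOSTS and route_for are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical host name standing in for the licensed magazine's server.
DIRECT_ONLY_HOSTS = {"journal.example.org"}


def route_for(url: str) -> str:
    """Return "direct" if the request must bypass the cache hierarchy,
    otherwise "hierarchy" (neighbors and parents may be asked)."""
    host = (urlsplit(url).hostname or "").lower()
    if host in DIRECT_ONLY_HOSTS:
        return "direct"
    return "hierarchy"


print(route_for("http://journal.example.org/vol1/article.html"))
print(route_for("http://www.example.com/index.html"))
```

Keeping the exception list keyed on the host name makes the rule easy to audit when the license terms change.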
Problems
Having the cache running means that you have solved some problems and gained new ones. Apart from typical problems such as unstable software, we discovered some very specific ones.
One of them is the handling of particular documents such as static pages regenerated daily. Everything works until such a document is served by an old HTTP server that does not support the HTTP 1.0 protocol, which means the server does not provide additional information such as the "Last-Modified" field. The cache then assigns the document a TTL value from the "access-per-type" table, which for documents fetched over HTTP usually holds a period longer than a day. The only solution in our installations was to put the document on the "http-stop" list. We hope that the problem will disappear as servers are upgraded and software develops.
It is also very annoying that sometimes we set up a server as a neighbor or parent and cached does not recognize it as active. Many times we found, using tools such as traceroute or ping, that we could reach a server marked as "down" in the cache status report. Upgrading the cache software would probably solve this.
These technical problems are relatively small compared with the organizational ones. The main obstacle to optimizing traffic in Torun is that our only route to the world goes through Bydgoszcz, and the line from Bydgoszcz is shared with Gdansk. Network development in Bydgoszcz requires setting up a strong w3cache there. Without a cache there, there is a threat of heavy traffic over the line to Warsaw.
Although the main Internet user in Torun is Nicolaus Copernicus University, the number of commercial customers connected to TORMAN keeps increasing. They should also use the w3cache, because they share the same resource, the outside link. There remains the problem of calculating the costs and the final price of using w3cache. We have discussed it several times, but there is still no solution that suits everyone.
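The TTL problem with daily documents can be sketched as follows. This is a simplified model in Python, not the actual logic of Harvest Cached: the default values, the stop-list entry and the rule of keeping a document for one fifth of its age when a modification date is known are all assumptions made for illustration.

```python
from datetime import timedelta
from typing import Optional

# Hypothetical per-protocol defaults, standing in for the type table.
DEFAULT_TTL = {"http": timedelta(days=2), "ftp": timedelta(days=7)}

# Hypothetical stop list: URLs containing these strings are never cached.
STOP_LIST = ["/daily/"]


def choose_ttl(url: str, scheme: str,
               age_since_modified: Optional[timedelta]) -> timedelta:
    """Return how long a fetched document may be served from the cache.

    age_since_modified is None when the origin server sent no
    modification date, as old HTTP servers do.
    """
    if any(pattern in url for pattern in STOP_LIST):
        return timedelta(0)  # stop-listed: always refetch
    if age_since_modified is not None:
        # Assumed heuristic: fresh for one fifth of the document's age.
        return age_since_modified / 5
    # No date available: fall back to the per-protocol default.
    return DEFAULT_TTL[scheme]


# A daily report from an old server gets the two-day default and goes stale ...
print(choose_ttl("http://example.org/report.html", "http", None))
# ... unless its URL matches the stop list.
print(choose_ttl("http://example.org/daily/report.html", "http", None))
```

The stop list is a blunt instrument, since it gives up caching the document entirely, but it is the only safe choice when the server provides no date to reason from.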
Plans for the future
The most important things we have to do for Torun's w3cache are:
All these things should make the w3cache more effective, reliable and easy to use.
Summary
Good w3cache access is an opportunity for sites located far from the main Internet routes, and Torun is such a place. We are halfway through the development of our local w3cache. We have paid much attention to setting up the servers, but these installations still need attention, funds, development and advertising. We hope that this point of view will be shared by those around us, and we expect good support and understanding from the local community so that we can provide this service better.