Caching in the Washington State K-20 Network
Justin Pietsch
pietsch@cac.washington.edu
Networks and Distributed Computing
Computing & Communications
University of Washington
In 1996 the Washington State Legislature appropriated $42 million for the
K-20 Educational Telecommunications
Network. Under the guidance of the K-20
Telecommunications Oversight & Policy Committee, the network is
to be ". . . an integrated and interoperable educational technology
network serving kindergarten through higher education and promoting access
for Washington citizens." Phase I of the project will create a network
for the six public
State Baccalaureate Universities, the 34
Community and Technical Colleges (CTC) and the nine Educational
Service Districts (ESD). During Phase II, the project will extend the network
to the 294 school districts statewide.
Three distinct networks, K-12, CTC and Baccalaureate, will be combined
to form the K-20 Educational
Telecommunications Network. Each of the three networks will be connected
via the Seattle
Network-to-Network Access Point. Each Phase I network node will be
connected to the hub site by inverse-multiplexed DS1s. Initially the connected
institutions will have 1.5 - 6 Mbps of bandwidth. This network is expected
to be developed over the next several months, in time for the beginning
of the 1997-98 school year.
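As a rough check on the 1.5 - 6 Mbps figure, the sketch below computes the nominal rate of a bundle of inverse-multiplexed DS1s. It assumes the range corresponds to bundles of one to four circuits, which this paper does not state explicitly, and it ignores any inverse-multiplexer overhead. The name bundle_mbps is illustrative only.

```python
DS1_MBPS = 1.544  # standard DS1 (T1) line rate in Mbps


def bundle_mbps(num_ds1: int) -> float:
    """Nominal aggregate rate of num_ds1 inverse-multiplexed DS1 circuits."""
    if num_ds1 < 1:
        raise ValueError("a bundle needs at least one DS1")
    return num_ds1 * DS1_MBPS


# Assumption: bundles of 1 to 4 DS1s cover the quoted 1.5 - 6 Mbps range.
for n in range(1, 5):
    print(f"{n} x DS1 = {bundle_mbps(n):.3f} Mbps")
```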
From the beginning, the network was designed with caching in mind. Both
performance and bandwidth conservation were critical factors in the network
design. Caching servers will be placed at each of the Phase I nodes. Each
sector will use Pentium-based PCs. Because the total number of network
nodes is greater than 45, choosing less expensive machines lets us afford
redundant cache servers, and Intel-based machines give us many options for
software. Each of the planned machines will have 166 MHz Pentium processors,
five 2 GB disks (8 GB for caching, 2 GB for the system) and at least 128 MB
RAM (most will have 192 MB RAM).
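The node count and per-machine cache space follow directly from the figures quoted above; a small sketch of the arithmetic, with variable names chosen here for illustration:

```python
# Phase I node counts as given in the project description.
PHASE1_NODES = {"baccalaureate": 6, "ctc": 34, "esd": 9}

DISKS_PER_MACHINE = 5   # five 2 GB disks per planned machine
DISK_GB = 2
SYSTEM_DISKS = 1        # one disk is reserved for the system

total_nodes = sum(PHASE1_NODES.values())
cache_gb_per_machine = (DISKS_PER_MACHINE - SYSTEM_DISKS) * DISK_GB

print(f"Phase I nodes: {total_nodes}")
print(f"Cache space per machine: {cache_gb_per_machine} GB")
```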
Under the current plan, the K-12 and Baccalaureate caching hierarchies will be run
together by Networks and Distributed Computing (NDC), which is a part of
Computing and Communications (C&C) at the University
of Washington. The caching for the CTC networks will be run by the
Communications Technology Center (CTC).
The K-12 and Baccalaureate caching servers will use Squid
caching software running on the Linux operating system. The CTC caching
servers will use Netscape
Proxy Server running on
Microsoft Windows NT Server. (For educational institutions, Netscape
software is free and Microsoft Windows NT Server is $45.)
Each of the three networks of caching servers will have parents at the
K-20 hub site. The parents also will be run by NDC and will be based on
Squid and Linux. Each of the three parent groups will use the other groups
as siblings. The current implementation plan for the K-12 and Baccalaureate
caching servers calls for two Squid caching servers at each network node,
configured as siblings of each other. Each of them will point
to the parent cache servers.
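The sibling and parent relationships described above can be sketched as a small in-memory model. This is a simplified illustration, not Squid's implementation: the Cache class, the node names and the origin function are hypothetical, a single hub parent stands in for the parent groups, and the sibling query (which Squid performs with ICP) is reduced to a dictionary lookup. The rule it illustrates is that a sibling is only asked whether it already holds an object, while a parent will fetch the object on a miss.

```python
from typing import Callable, Dict, List, Optional, Tuple


class Cache:
    """Toy cache node with an optional parent and a list of siblings."""

    def __init__(self, name: str, parent: Optional["Cache"] = None) -> None:
        self.name = name
        self.parent = parent
        self.siblings: List["Cache"] = []
        self.store: Dict[str, str] = {}

    def fetch(self, url: str, origin: Callable[[str], str]) -> Tuple[str, str]:
        """Return (body, source), where source names whoever supplied the object."""
        if url in self.store:
            return self.store[url], self.name
        # Siblings are only asked for objects they already hold.
        for sib in self.siblings:
            if url in sib.store:
                body, source = sib.store[url], sib.name
                break
        else:
            # No sibling hit: forward the miss to the parent, or go direct.
            if self.parent is not None:
                body, source = self.parent.fetch(url, origin)
            else:
                body, source = origin(url), "origin"
        self.store[url] = body
        return body, source


def origin(url: str) -> str:
    """Hypothetical origin server."""
    return f"<body of {url}>"


hub = Cache("hub-parent")
node_a = Cache("node-a", parent=hub)
node_b = Cache("node-b", parent=hub)
node_a.siblings.append(node_b)
node_b.siblings.append(node_a)
other = Cache("node-c", parent=hub)  # a cache at a different network node

url = "http://www.example.edu/index.html"
for cache in (node_a, node_b, other):
    _, source = cache.fetch(url, origin)
    print(f"{cache.name} got the object from {source}")
```

In this model the first request climbs to the origin, the second is answered by the sibling at the same node, and a request from another node is answered by the hub parent.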
This K-20 cache deployment will provide a testbed for concepts of cache
hierarchy and cache siblings. We are very much interested in the idea of
using many less expensive caching servers rather than a few high-end caching
servers. If this proves out well, then it will be very easy and inexpensive
to add more machines when the load gets higher, rather than having to replace
or upgrade machines.
In the K-12 network, the path from the top of the hierarchy at
the K-20 hub site, to the ESD, to the school district main office and
finally to the individual school building spans four levels of caching hierarchy.
As of the writing of this paper, we have little experience with caching hierarchies.
NDC has been running several caching servers for over six months, but the
clients connect to the caching servers via Ethernet and the University
is directly peered with our ISP, NorthWestNet,
which has multiple DS3 connections. The difference in latency and bandwidth
in the K-20 network could prove to be significant, and the work to make
the hierarchy successful could prove to be more involved than it would be
for a hierarchy on our own campus.
Another critical factor is how to get K-20 end users to utilize this caching
service. We do not anticipate blocking outbound port 80 or using other
means to force end users onto the servers. This means that
end users must voluntarily elect to use the caching servers. We intend
to work with representatives in each sector to explain the benefits of
caching, in the hope of more widespread utilization.
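To make concrete what voluntarily electing to use a cache involves on the client side, here is a minimal sketch in which an HTTP client is explicitly pointed at a proxy. The host name cache.example.edu is hypothetical, and port 3128 is Squid's default HTTP port; an actual K-20 cache may be configured differently. End users would typically make the equivalent setting in their browser's proxy preferences.

```python
import urllib.request

CACHE_HOST = "cache.example.edu"  # hypothetical cache server name
CACHE_PORT = 3128                 # Squid's default HTTP port


def make_cached_opener(host: str = CACHE_HOST, port: int = CACHE_PORT):
    """Build an opener that sends plain HTTP requests through the given proxy."""
    proxy = urllib.request.ProxyHandler({"http": f"http://{host}:{port}"})
    return urllib.request.build_opener(proxy)


opener = make_cached_opener()
# A request such as opener.open("http://www.washington.edu/") would now be
# sent to the cache rather than directly to the origin server. It is not
# executed here because the proxy host above is only a placeholder.
```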
Still another detail that has not been resolved is the role which the individual
campus caching servers will play. We do not know whether they will in turn
be parents for each site's own caching servers, whether individual clients
will point to K-20 caching servers, or both. The caching requirements of a
large university with over 20,000 students will be different from those
of a community college, which will in turn be different from those of an
ESD.
We have an exciting opportunity to learn about caching in this project. How
well does a caching hierarchy really work? How much will each
of the three networks differ in how it uses the caching network? How well does Netscape
Proxy Server use Squid as a parent? How does one organization manage 30+
boxes all across the State without leaving the central office?
last modified 4/25/97 by Justin
Pietsch