Relation analysis, cache meshes

Work in progress, please email me if interested

Ingrid Melve, UNINETT

Threshold

Web proxy cache systems with several servers currently have no well documented ways of evaluating the relationships between cache proxy servers, for gains in terms of money, latency and bandwidth. This work intends to find a threshold value for relations between servers, based on bandwidth saved, latency reduction and local cost for Internet access.

This threshold will be of interest for work on auto configuration of large multi server proxy cache systems.

The web cache proxy software used is Squid. Co-operating web proxy cache servers using sibling mode is used to illustrate the principles. The work may easily be used to analyze parent relations. Local proxy performance is not considered, only traffic into the proxy.

Intro

proxy
An intermediary program which acts as both a server and a client for the purpose of making requests on behalf of other clients. Proxies are often used in firewalls and as gateways for handling requests via protocols not implemented by the user agent.
cache
A cache stores cacheable responses in order to reduce the response time and network bandwidth consumption on future, equivalent requests. Any client, server or proxy may include a cache.
sibling
caching server participating in caching mesh, sends/receives requests to other cache servers
parent
neighbor/sibling caching server through which misses are resolved
relation
relation is used for the object exchange between two caches
object domain
all the objects that the cache asks a sibling for
hit rate
web object served from cache, as a proportion of total number of requests
latency
time before a request is served
Family relations in Squid are either sibling or parent. A relationship is formed if a cache is willing to serve another cache. Relations are based on object domain criteria, which may be for all documents or for documents in a subdomain like .com. Relationships may follow complex rules. The simplest relationship is to send request for all documents. No relation equals no request from sibling cache. Relations per domain are used in COM-MESH.

Cache managers need some way of evaluating relationship. Today this is done on the basis of analyzing web cache logs for various parameters. There is no easy or consistent analysis.

My proposal is to form a "magic number" per relationship that is used to evaluate the relations. This may be used to automatically reconfigure relationships, to terminate relations if the gain drops below a threshold.

COM-MESH

There is an urgent need for analysis of COM-MESH, we need to distinguish between good relation and the bad relations. If we want to use asymmetrical relations, we need a way to evaluate and compare with symmetric relations.

Gerhard Winkler, ACCOnet, has found that hit rates drop over time, since cache content gets homogenized. One solution to this is to use the proxy-only mode, another to establish a round-robin solution and change relations to the next on list once a cache drops below the threshold. He needs a threshold to start with.

The simplest solution is to look at bytes saved per relation. This approach is promoted by NetCache

Assumptions

why cache

Basic assumption is that the two reasons for running a web cache are

Other gains from web proxy caching, such as anonymization and access control are not looked into.

latency

Latency reduction is important for clients placed a long distance from the web servers, as setting up a TCP connection takes time (1.5 round trip times). Most popular web servers are placed in the US, this is far away for non-American users.

Time to look up if the sibling has object is negligible compared with download time. Typical value is 2 orders of magnitude smaller.

non-persistent connections

Assume non-persistent connections, as used by HTTP/1.0 Even though some browsers implement Keep-Alive, Squid/1.1 does not support this.

cache system

Cache network

The figure shows a cache system where the green part is clients requesting documents from web servers in pale blue. The primary proxy cache is shown in brown, and its sibling in blue.

Documents requested through the proxy is either served from the cache, directly from the origin web server (solid red line) or from the sibling (blue line). The cost of getting an object from the sibling should be lower than getting an object directly.

real time or retrospect?

Calculations may happen in real time or evaluation may happen every night (or all nights except weekends). There is still work to be done to evaluate if continuous calculation is feasible.

Factors

Basic components for this threshold value are

  1. a factor representing the delay (compared with general delay for the object domain)
  2. traffic volume: traffic flow into the proxy from sibling, and traffic flow into the proxy from origin web servers
  3. two cost factors representing the cost for getting objects directly as opposed to via the cooperating cache, measured in traffic volume and delay.

Delay

The optimum latency measurement would be to measure median response time for hits during peak load and median response time for misses during peak load. Different response types should be separated, and emphasis placed on HTTP "200 OK" responses.

Performance issues related to computation of real-time median suggests that using weighted mean (throw away values larger than 3 times the standard deviation) yields a good enough result.

Experience in network measurements indicate that peak load may be chosen as the hour which has most load, counting Monday to Friday. Weekend traffic is significantly lower i UNINETT, students seem to take time off to party and faculty probably writes scientific papers. This may be different for other network environments.

Dimensioning should be done using the extreme conditions where web traffic is at its high water mark (high water is when your feet get wet and you feel uncomfortable unless you move on to a better place (or higher bandwidth)).

Delay with sibling is calculated by adding together delay for hits and delay for misses and divide by number of requests. Delay for hits is given by weighted mean multiplied by number of hits. Delay for misses is given by weighted mean multiplied by number of misses.

Suggested formula for delay when using web proxy cache sibling:

D = Ds*Th + Dw*Tm

D = Delay
Tc = Number of object domain HTTP connections
Th = Number of hits in object domain
Tm = Tc -Th
Ds = Delay for sibling hits
Dw = Delay for object direct from web origin server

D is measured in peak hour.

Delay without sibling is approximated by calculating delay for misses in object domain multiplied by number of requests.

Suggested formula for delay without web proxy cache sibling:

D = Dw*Tc

D = Delay
Tc = Number of object domain HTTP connections
Dw = Delay for object direct from web origin server

High document hit ratio gives low latency (there is a trade off between latency and high byte hit ratio). The document hit ratio is implicitly handled by considering latency.

Traffic volume

Traffic volume to proxy is calculated for traffic from web servers directly to proxy (misses) and for traffic from sibling (HTTP hits and ICP responses). The ratio between direct traffic and sibling HTTP traffic is the net byte gain. The ICP traffic is added to compute the total traffic flow from sibling to proxy. If cost of traffic from sibling is low, the ICP traffic may be neglected.

The total traffic volume is composed of ICP traffic from sibling, HTTP traffic from sibling (hits) and HTTP traffic from web servers (misses).

ICP traffic from sibling is response to ICP requests. All requests for documents result in an ICP response-request to sibling (unless there is a hit at the proxy, but that case falls outside the scope of this document). Using an average object size of 3kB yields a break even of 3.3% object hit rate before the ICP traffic exceeds the HTTP traffic on the link to the sibling. Each ICP request/response uses on average 128B, 64B per request.

Traffic needs not be symmetric, the traffic volume is calculated only flowing into the web cache. Similar calculations may be done for traffic flowing to sibling cache, this is left out to simplify calculations. Most Internet connections have a loop sided traffic profile, importing more than they export. Normally connections are full duplex, and the dimensioning factor is traffic flowing into the network. In this case, outflowing traffic needs not be considered.

Cost

Cost factor for an ISP may be cost of international connection as opposed to local cost of bandwidth, or it may be cost per byte transferred out of cache. For a local sysadmin the cost factor may be the access line cost compared with local traffic exchange cost. Finding this factor is not necessarily easy, as some costs are not paid at the same organizational level as the cache servers are run.

Cost/benefit analysis from DESIRE project gives examples of costs for international versus national bandwidth.

Example: Cd = cost of manhours

Cd is a cost factor for delay, an ISP is likely to set this to zero (unless there is a demand for faster web access).

Example: Ct = cost of international bandwidth / national cost
Ct is a cost factor for traffic volume, an ISP is likely to but most emphasis on this.

Cost factors for traffic volume:

Total cost of traffic volume is given by multiplying traffic from sibling with Cts and multiplying direct traffic by Ctw. Traffic from sibling is composed of ICP responses (see above) and HTTP hits. If the cost of sibling traffic is neglible, the cost of traffic is equal to the cost of direct traffic. Comparing this number to the case where all traffic is direct, gives the gain of web caching.

Proposed formula

Basic components for this threshold value are

  1. Delay: a factor representing the delay (compared with general delay for the same documents)
  2. Byte gain: Total traffic in object domain, gain from sibling. May include calculations on ICP overhead and traffic from sibling.
  3. Cost: cost factors representing the cost for getting objects directly as opposed to via the cooperating cache or the cost of delay
The two first may be computed from Squid logs, the last is local to each site.

Suggested formula for cost when using web proxy cache sibling:
(Ds*Th + Dw*Tm)*Cd + (Tb - Hb)*Ctw + Hb*Cts + Ib*Cts

Suggested formula for cost without sibling:
Dw*Tc*Cd + Tb*Ctw

Results common for object domain

Tc = Number of object domain HTTP connections
Th = Number of hits in object domain
Tm = Tc -Th Number of misses in object domain
Dw = Delay for object direct from web origin server
Tb = Total object domain HTTP traffic in bytes
Ib = 64B*Tc ICP traffic in bytes (64 bytes per connection)
Tc = Number of object domain HTTP connections

Various cost factors (individual per sibling)

Cts = Cost of traffic from sibling
Ctw = Cost of traffic direct from web origin servers
Cd = Cost of delay

Results per sibling:

Ds = Delay for sibling hits
Hb = Byte hit rate

Ds and Dw is measured in peak hour.

Simplified formulae

Volume (simplify)

When considering only bandwidth savings, the cost factor Cd is set to zero. The simplified result is

(Tb - Hb)*Ctw + Hb*Cts + Ib*Cts with sibling
Tb*Ctw without sibling

If cost of sibling is neglible (i.e. same LAN), Cts may be set to zero, which results in

(Tb - Hb)*Ctw with sibling
Tb*Ctw without sibling

Subtracting these two to find total saving yields

Hb*Ctw

Byte hit rate multiplied with cost of connection if not hit.

Delay (simplify)

Neglecting traffic volume, and setting Cd = 1 gives

Ds*Th + Dw*Tm with sibling
Dw*Tc without sibling

Subtracting these two to find savings results in

(Dw - Ds)*Th

Total number of connections multiplied with median delay difference for hit and miss.

Cost/benefit

DESIRE report

cost factors

Estimate of cost for traffic volume The relative figures are from SURFnet 1997 as reported in DESIRE Report on the costs and benefits of operating caching services.

These cost figures shows that in the European context is makes sense to ignore all national and European costs.

example

www-cache.uninett.no 28.08.97

Using Calamaris to extract numbers per day per object domain (thanks to Ernst Heiri for his patch). Look at results for .com domain. Chose to use this date, as there are other results gathered from other web caches that may be used for comparison (at a later stage).

Results common for object domain

Tc = 78364
Th = 32194
Tm = 46170
Dw = 5.25 s
Tb = 867258 kB
Ib = Tc*64B = 5015 kB

Various cost factors

Cts = 1
Ctw = 250
Cd = 0.05 NOK/s (estimate)

Total for all 8 siblings

Ds = 5.19 s
Hb = 56686 kB

The first question is if there is any overall benefit from the sibling relations of this cache. Computing the byte cost with siblings (using Ib as ICPsize multiplied with the number of siblings) gives a byte cost of (Tb - Hb)*Ctw + Hb*Cts + Ib*Cts with sibling and a byte cost of Tb*Ctw without sibling. Assumes that all siblings have the same cost factors (this is not entirely correct as one sibling is located in the US and the rest in Europe). The cost gain from the siblings is 14074694/216814500 = 6.5% The extra byte cost introduced by the siblings is not significant for values of Ctw > 1.7*Cts.

Cost of delay for the overall configuration is given by (Ds*Th + Dw*Tm)*Cd, whereas the delay without sibling may be estimated by Dw*Tc. The savings are 1933 ksec, which amounts to some 96650 NOK (if time is money).

If the cost factors are Ctw = 10, Cts = 1, the byte gain is significantly lower, around 0.2%. The complexity of the configuration does not justify such a small gain.

Results per sibling:

cache.nic.surf.net

Ds = 3.31 s
Hb = 9416 kB

For values of Ctw > 1.53*Cts there is a gain for byte cost.

cache.nic.surfnet.nl 28.08.97

Results common for object domain

Tc = 202267
Th = 87788
Tm = 114479
Dw = 9.1 s
Tb = 2563825 kB
Ib = Tc*64B = 12945 kB

Various cost factors

Cts = 1
Ctw = 250
Cd = 0.06 NOK/s (estimate)

Total for all 7 siblings

Ds = 6.68 s
Hb = 303856 kB

Results per sibling:

www-cache.uninett.no

Ds = 5.94 s
Hb = 34899 kB

Analysing relation with www-cache.uninett.no gives a reduction in delay and cost. For values of Ctw > 1.37*Cts there is a gain for byte cost. There is a higher gain for this relation than for the opposite relation, the gain is not symmetric.

Further work

More work on all relations involved in COM-MESH needs to be done, to properly evaluate the gain from such a large scale mesh. All relations for the given day should be calculated, as the data is available.

Calculations need to be tested in real time, to see if this framework is applicable to real time "best sibling" selection mechanisms. This may be one step on the road to mesh autoconfiguration.

References

Squid
Web proxy cache software from NLANR
Cost/benefit analysis
Andre Jong, Henny Bekker, Ton Verschuren, Ingrid Melve, DESIRE report, November 1997
COM-MESH
Experiment in TF-CACHE, meshing national academic networks in Europe
Calamaris
Squid log analyser by Cord Beermann

Ingrid Melve
Last modified: Wed Apr 15 17:31:06 1998