Nowadays, hierarchical organisation of caches is commonly used. The main objective is to overcome the cumbersome communication paths among servers and so improve the response times experienced by end users. The efforts are mostly focused on the still exponentially growing Web. There is an interest in analysing different caching architectures while always using the same basic hardware, software and network setups. In order to analyse the influence of architectural changes, the QoS specification of caching servers must be studied. This paper presents an approach to evaluate, from the users' point of view, Web caching parameters to be used in QoS characterisation.
Several tools have been developed in order to evaluate QoS in Web caching. These tools have been applied to a case study and the results obtained are analysed.
Keywords: Web caching, proxy caching architectures, QoS, ICP, HTCP, HoT CraP, HTTP.
1 Introduction
2 Basics of Internet object caching
2.1 Simple caches
2.2 Co-operative caches
2.2.1 The role of protocols
ICP
HTTP
2.2.2 Known problems
2.2.3 A new proposal: HTCP
3 Architecture testbed
4 Measuring the Web caching QoS
5 Preliminary results
6 Conclusions and further work
7 Acknowledgements
References
The Internet is now a widespread means of interaction among people everywhere. The World Wide Web, along with its associated servers and browsers, is no doubt the most popular set of Internet applications. At first glance this seems quite nice, but a more attentive analysis reveals some technological problems. As we all know, the Web is resource intensive, consuming a lot of bandwidth when documents are transferred - documents which can be as small as a few Kbytes or as big as several Mbytes (especially sound, clip and image files). Of course, we can always think of upgrading circuits for higher bandwidth, buying faster computers, extending memory and disks... Nevertheless, this solution is almost always economically impracticable: costs grow too high compared with the short/medium term benefits, and demand keeps growing as well.
The current and most commonly used solution to the lack of bandwidth for such a high number of Web requests is Web caching. This technique uses the knowledge acquired from several analyses of server access logs and from studying the behaviour of Web users, both individually and as members of an organisation, to reduce the latency experienced by end users when fetching documents through their Web browser [1, 2].
The basic concept of caching is the intermediate storage of popular Web documents close to end users. It takes advantage of temporal locality [3] of accesses - for example, in our University it is very probable that several users will read the headlines of the morning e-newspaper within a short period of time. Normally, Web documents are requested much more than once.
Andrew Cormack [4] considers two distinct types of caches: simple caches and co-operative caches. In simple caching, communication is only possible hierarchically, through TCP connections; caches at the same level are not accessible. In co-operative caching, all caches can participate in the process of satisfying user requests.
Simple caches are being abandoned because they waste both bandwidth and disk space. With this type of cache, if an object is not present, a request is issued to the cache one level up in the hierarchy. Since ICP [5, 6] is not used, caches at the same level cannot be queried, so simple caching is no longer a good solution.
Co-operative caches, unlike simple ones, admit richer co-operation, which makes them quite powerful. Nevertheless, there are some unwanted effects that need further analysis.
The next sections are devoted to the analysis of several aspects related to the behaviour of these caches.
2.1 Simple caches
Without any caching mechanism, when a browser needs to get an object from a specified host (both present in the URL), it just makes a direct connection to that host and tries to retrieve the object. Of course, if the object doesn't exist the end user will receive an error message. One advantage of using caching (either simple or co-operative) is that these messages can be avoided. Most recent proxies can use Proxy Automatic Configuration, PAC. The browser is configured to use a script, which can have pre-configured alternatives for the case where a proxy-cache or origin server is not available. Even though this technique can increase response times, the increase is negligible when compared with the improved availability of information. PAC is also used for load balancing purposes in clusters of caches.
With simple caching there is the possibility of making requests to a cache. Each time a user makes a request through their browser, a TCP connection is made to the cache server instead of to the origin server. With this we expect to reduce the time needed to service requests. Within an organisation it is highly probable that more than one user requests the same object. The cache can therefore intercept requests for the same object and avoid a direct connection to the origin server for each of them, making only one request. This technique can reduce both the bandwidth wasted and the latency in serving requests.
Another useful aspect is that if the contacted cache doesn't have the requested object, it is able to forward the request either to a parent cache or to the origin server.
However, these kinds of operations have some limitations and problems.
First, the hierarchy cannot have more than two or three levels [7], because an object retrieved from an origin server (or from an upper level cache) through intermediate caches is stored in all the caches used to convey the object to the user. This means that caches at higher levels need a lot of disk space; otherwise objects will be discarded, most of the time by a Least-Recently Used policy, before they become stale.
The second disadvantage is the need for a TCP connection (which requires the exchange of at least eight packets) each time we want to retrieve an object. This is quite heavy. A better solution is to use ICP for querying neighbour caches. ICP, however, has its own limitations, most of them caused by the lack of information in ICP headers: only part of the information in HTTP headers is present in the ICP ones. In order to solve these problems, the ICP Working Group is developing the Hyper Text Caching Protocol, HTCP [8] - also known as HoT CraP.
HTCP messages are richer than ICP ones. In particular, there are special headers carrying information about caching.
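The first limitation above (objects being discarded by a Least-Recently Used policy before they become stale) can be illustrated with a minimal sketch. The class name `LruObjectCache`, its byte-based capacity and the explicit `now` parameter are assumptions made for illustration only; they do not describe how any particular proxy-cache is implemented.

```python
from collections import OrderedDict
import time


class LruObjectCache:
    """Toy object cache that evicts the least-recently used entry when full."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # url -> (body, expires_at), oldest first

    def get(self, url, now=None):
        now = time.time() if now is None else now
        entry = self._entries.get(url)
        if entry is None:
            return None  # miss: the object was never stored or was evicted
        body, expires_at = entry
        if now >= expires_at:
            self._remove(url)  # the object is stale
            return None
        self._entries.move_to_end(url)  # mark as most recently used
        return body

    def put(self, url, body, ttl_seconds, now=None):
        now = time.time() if now is None else now
        if url in self._entries:
            self._remove(url)
        if len(body) > self.capacity_bytes:
            return  # too large to cache at all
        # Evict least-recently used objects until the new one fits,
        # even if the evicted objects are still fresh.
        while self.used_bytes + len(body) > self.capacity_bytes:
            self._remove(next(iter(self._entries)))
        self._entries[url] = (body, now + ttl_seconds)
        self.used_bytes += len(body)

    def _remove(self, url):
        body, _ = self._entries.pop(url)
        self.used_bytes -= len(body)
```

With a small capacity, a still-fresh object disappears as soon as newer objects need the space, which is the effect the text attributes to upper-level caches with insufficient disk.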
2.2 Co-operative caches
An institution with several departments may wish to have several caches at the same level (one per department, for example) that are able to co-operate with each other in serving requests. This co-operation is possible using the ICP protocol.
There are several possible methods of co-operation. Depending on the way they collaborate, caches are known as siblings or parents. The difference between these two types of co-operation is straightforward: parents can help to serve a request they receive even if they don't have a copy of that object; siblings can only serve a request if they already have a local copy. The way proxy-caches are organised defines a particular caching architecture.
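The difference between parents and siblings can be sketched as follows. The `Peer` type and the preference order in `choose_peer` (any peer holding a copy first, then a parent) are assumptions for illustration; the text only states which kind of peer may serve a request without a local copy.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    kind: str        # "parent" or "sibling"
    has_copy: bool   # whether the peer already holds the requested object


def can_serve(peer):
    """A parent may fetch on our behalf; a sibling needs a local copy."""
    return peer.kind == "parent" or peer.has_copy


def choose_peer(peers):
    """Prefer a peer that already has the object, then fall back to a parent."""
    for peer in peers:
        if peer.has_copy:
            return peer
    for peer in peers:
        if peer.kind == "parent":
            return peer
    return None  # no usable peer: contact the origin server directly
```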
The HTTP/1.0 protocol [9] (HTTP/1.1 is being introduced [10]), used for Web transfers, is quite complex and heavy. ICP, a simpler, lightweight protocol, was designed for querying purposes among caches (and HTCP is under development...).
2.2.1 The role of protocols
Any proxy caching server (cache, for short) able to co-operate with other caches is said to be a peer or neighbour of those caches. Peers admit further classification into parent/child and sibling/peer relationships.
Communication among peers is, for the time being, accomplished by means of the ICP protocol, as stated before.
ICP
As an example, let's consider the relationships among peers pictured in figure 1, where C1 has two siblings (C3 and C4) and one parent (C2).
Each time cache C1 receives a request, it can send queries to caches C2, C3 and C4 using the message ICP_QUERY. (It sends a message to the origin server too, considering this as part of the selection process to serve a request.)
If one of the peers of C1 has a fresh copy of the requested object, it will reply with an ICP_HIT or ICP_HIT_OBJ. If a peer doesn't have the object, or the object will be stale within the next 30 seconds, it will return an ICP_MISS. Figure 2 shows a detailed diagram that explains what happens when a cache receives an ICP-request (opcode ICP_QUERY). The message ICP_DENIED should only occur if the client cache is not authorised to communicate with the cache receiving the ICP-request. In such cases, administrators should contact each other to solve the problem, which normally means changing the ACLs in the configuration file.
Another possible message is ICP_MISS_NOFETCH, which occurs when a parent cache is not able to forward requests, maybe due to network connection problems. However, this cache continues to receive ICP-queries (to determine when the problems are solved) and is able to serve requests when objects are present in the cache.
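The reply rules described in the text can be condensed into a small decision function. This is only a sketch of the prose above, not of the diagram in figure 2: the function name, its parameters and the `fits_in_reply` flag (used to pick ICP_HIT_OBJ over ICP_HIT) are hypothetical.

```python
STALE_MARGIN_SECONDS = 30  # objects about to go stale are answered as misses


def icp_reply(authorised, has_object, seconds_until_stale, can_forward,
              fits_in_reply=False):
    """Return the ICP opcode a peer would send in answer to an ICP_QUERY."""
    if not authorised:
        return "ICP_DENIED"  # the querying cache is not allowed by the ACLs
    if has_object and seconds_until_stale > STALE_MARGIN_SECONDS:
        # Assumption: ICP_HIT_OBJ is chosen when the object can be carried
        # in the reply itself, ICP_HIT otherwise.
        return "ICP_HIT_OBJ" if fits_in_reply else "ICP_HIT"
    if not can_forward:
        return "ICP_MISS_NOFETCH"  # e.g. the peer has lost its upstream link
    return "ICP_MISS"
```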
Let's look now at how ICP-replies are processed. Figure 3 shows the way ICP-replies are processed in order to select one peer cache for getting a particular object.
The important point here is that some of the fields present in HTTP headers are not in ICP queries. So an ICP-reply can indicate that a particular object is present in the cache and fresh (a HIT), and yet when the HTTP request is made a response can be issued indicating that the request cannot be fulfilled. The next sub-section discusses some of these problems.
HTTP/1.1 is progressively being introduced but HTTP/1.0 is still the most used. So, let's stick with this version and analyse some of the options that can affect the behaviour of caching mechanisms. The options presented here are not present in ICP messages but are present in HTTP ones.
Attempts to solve these problems are being made. The ICP Working Group proposes a new protocol, the already mentioned HTCP. Its purposes are wider, pointing to a complete change in today's philosophy of caching. It is a kind of proposal for migration from client-driven caching (users' requests determine cached objects - if cachable) to "pro-active" caching.
HTCP has full HTTP request and response headers and extra useful caching information headers. Namely, Cache-Location:, Cache-Policy: and Cache-Expiry: headers are particularly important in what concerns caching.
The Cache-Location: header adds flexibility. A cache can indicate alternative suppliers for the requested object, increasing the availability of information.
The Cache-Policy: header determines, for instance, whether an object is cachable and/or can be shared (similar to, but more efficient than, the Squid 1.1 private/public notion of objects).
The Cache-Expiry: header indicates for how long an object can be considered fresh.
In spite of these, some researchers say that the long term solution will be "Adaptive Web
Caching" [12]. Briefly, this technique would
use the theory of communication in groups, taking advantage of IP multicast
and reaching reliability through Scalable Reliable Multicast protocol,
SRM [13].
Commonly used caching architectures have only peers that behave as parents or siblings. Portugal has four top-level domain proxy-caches (two in Porto and another two in Lisboa; they are called the RCCN [14] proxies), and most (if not all) of the higher education institutions should have their own proxy-cache co-operating with these top-level caches.
The University of Minho has about 1000 teachers and around 14000 students accessing the WWW, either by LAN access or dial-up. Our University shares the 10 Mbps RCCN (Portuguese Academic Network) backbone with other education institutions. The link connecting the campus to this backbone has a bandwidth of 4 Mbps. With the European TERENA project [15] our international connections are considerably better, but still not enough for the high and still growing volume of WWW traffic.
The easiest and most cost-effective way to obtain better response times is, of course, the use of caching techniques.
The current architecture is composed of one proxy-cache (proxy-www) connecting the University to the "Internet World" - parent caches or remote servers - with several children attached to it (there are no siblings). As there is a firewall, proxy-www is the means of accessing servers outside it. "Inside firewall" accesses are made through direct connections.
Keeping the same set of servers and studying the effects of establishing new architectural relations among servers is one goal. Being able to analyse QoS variation, along with those architectural changes is another goal.
There are references in the caching literature pointing out that most requested objects are small. Believing that size can influence response time, the following performance analysis considers requested objects distributed over several categories depending on their size.
There are several ways of evaluating the performance of co-operative proxy-caches. Some of the approaches use information concerning the utilisation of computational resources, such as memory, disk space, cpu usage, ... Other approaches consider bandwidth utilisation or latency perceived by end user.
This section describes a new way of measuring proxy-cache QoS in terms of response time, i.e., how long it takes to serve end user requests.
At first glance, it may seem that computing the average response time per request would be easy. However, in order to compute a significant measure, useful for comparing the performance of different proxy-cache architectures, some other considerations had to be taken into account.
As the objective is to test the performance of different architecture configurations for a particular proxy-cache, some decisions were made:
1st - Only HTTP requests were considered - they are by far the most relevant for the purpose of determining QoS, as almost all requests are HTTP;
2nd - HTTP requests were then split according to their size, as defined in table 1 (i=0..9);
3rd - In each class, four types of HITs were considered (j=0..3):
| Class i | Sizes |
| 0 | 0KB-1KB |
| 1 | 1KB-5KB |
| 2 | 5KB-10KB |
| 3 | 10KB-50KB |
| 4 | 50KB-100KB |
| 5 | 100KB-500KB |
| 6 | 500KB-1MB |
| 7 | 1MB-5MB |
| 8 | 5MB-10MB |
| 9 | >= 10 MB |
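Assigning an object to one of the size classes of table 1 can be sketched with a sorted list of class boundaries. Two points are assumptions, since the table does not settle them: 1 KB is taken as 1024 bytes, and each class is treated as half-open (lower bound included, upper bound excluded), which is at least consistent with the last class being ">= 10 MB".

```python
from bisect import bisect_right

KB = 1024        # assumption: binary kilobytes
MB = 1024 * KB

# Upper bounds (exclusive) of classes 0..8; class 9 is everything >= 10 MB.
UPPER_BOUNDS = [1 * KB, 5 * KB, 10 * KB, 50 * KB, 100 * KB,
                500 * KB, 1 * MB, 5 * MB, 10 * MB]


def size_class(size_bytes):
    """Return the class index i (0..9) for an object of the given size."""
    return bisect_right(UPPER_BOUNDS, size_bytes)
```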
Other considerations could be made and are still under study. For example, our institution has a firewall, but discussing it is not relevant to the objective of testing the architecture performance of the proxy-cache, because this aspect of the configuration will remain unchanged even though the architecture will change.
For the purpose of calculating caching server QoS within an architecture we obtain, for each class (i,j), the following values:
mean size of objects and corresponding coefficient of variation (CVij,B);
mean response time and corresponding coefficient of variation (CVij,E);
ratio (Pij), defined as the number of HITs in category (i,j) to the total number of requests (UDP_HIT_OBJ plus all types of TCP HITs).
Finally, the QoS is obtained doing the following calculation:
where:
It is well known that the mean, as a representative measure, has some limitations. For example, it is affected by extreme values. So we complement the QoS, as defined above, with two other measures, CVB and CVE, which show the degree of variability in size and in response time.
They are defined as follows:
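As a rough illustration of the per-class values only (mean size, mean response time, their coefficients of variation and the ratio Pij), the sketch below groups records by class (i,j). It does not attempt the aggregate QoS, CVB or CVE. The record layout is hypothetical, and the coefficient of variation is taken as standard deviation divided by mean using the population standard deviation, which is an assumption.

```python
from collections import defaultdict
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (0.0 when the mean is zero)."""
    m = mean(values)
    return pstdev(values) / m if m else 0.0


def per_class_stats(records):
    """records: iterable of (i, j, size_bytes, response_ms) tuples, one per HIT."""
    groups = defaultdict(list)
    for i, j, size_bytes, response_ms in records:
        groups[(i, j)].append((size_bytes, response_ms))
    total_hits = sum(len(group) for group in groups.values())
    stats = {}
    for key, group in groups.items():
        sizes = [size for size, _ in group]
        times = [elapsed for _, elapsed in group]
        stats[key] = {
            "mean_size": mean(sizes),
            "cv_size": coefficient_of_variation(sizes),
            "mean_time": mean(times),
            "cv_time": coefficient_of_variation(times),
            "ratio": len(group) / total_hits,  # share of all HITs in this class
        }
    return stats
```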
The results presented in this section come from the analysis of the TLD proxy cache at the University of Minho, proxy-www, and of the RCCN proxy-caches located in Porto.
The access log files of 6 days were used to compute both the previously defined measures and some other useful information. During this period, 7693 requests were HITs (2752 TCP_HIT, 3240 TCP_IMS_HIT and 1701 TCP_REFRESH_HIT) - proxy-www has the default configuration and so it does not piggyback small objects in the payload of ICP reply messages.
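Counting HITs by type from an access log could look roughly like the sketch below. It assumes a Squid-style native log line, with whitespace-separated fields in which the second field is the elapsed time in milliseconds, the fourth is `RESULT_CODE/STATUS` and the fifth is the number of bytes; the actual layout of the analysed logs is not given in the text, and the function names are illustrative.

```python
from collections import Counter

HIT_CODES = {"TCP_HIT", "TCP_IMS_HIT", "TCP_REFRESH_HIT", "UDP_HIT_OBJ"}


def parse_line(line):
    """Extract elapsed time, result code and size from one log line."""
    fields = line.split()
    if len(fields) < 5:
        return None  # not a complete log entry
    result_code = fields[3].split("/")[0]
    return {
        "elapsed_ms": int(fields[1]),
        "code": result_code,
        "bytes": int(fields[4]),
    }


def count_hits(lines):
    """Count log entries per HIT type, ignoring misses and malformed lines."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["code"] in HIT_CODES:
            counts[entry["code"]] += 1
    return counts
```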
Figure 5 shows the requests that resulted in any of the HITs referred to above. It is clear that most objects are small. In fact, approximately half of the 7693 requested objects were smaller than 1 Kbyte.
Figure 5 - Distribution of objects, considering the three types of TCP HITs
These requests represent 39,6 Mbytes of transferred information. Table 2 shows the 7693 objects classified by size and response time. The size classes considered were those of table 1 but, as the number of requests in the categories above 50 Kbytes was very small, these were aggregated - the last class includes all objects greater than 50 Kbytes.
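A cross-tabulation of this kind (latency category by aggregated size class, with a request count and a byte count per cell) can be sketched as follows. The latency boundaries are the ones used as row labels in the tables of this section; the function name, the record layout and the use of 1024-byte kilobytes are assumptions.

```python
from bisect import bisect_right
from collections import defaultdict

# Upper bounds (exclusive) of the latency categories, in milliseconds;
# the last category is ">= 50 s".
LATENCY_BOUNDS_MS = [200, 500, 1000, 2000, 3000, 4000, 5000, 6000, 7000,
                     8000, 9000, 10000, 15000, 20000, 25000, 30000, 35000,
                     40000, 45000, 50000]
# Aggregated size classes: 0KB-1KB, 1KB-5KB, 5KB-10KB, 10KB-50KB, > 50KB.
SIZE_BOUNDS = [1024, 5 * 1024, 10 * 1024, 50 * 1024]


def cross_tabulate(records):
    """records: iterable of (elapsed_ms, size_bytes); returns cell -> [requests, bytes]."""
    table = defaultdict(lambda: [0, 0])
    for elapsed_ms, size_bytes in records:
        key = (bisect_right(LATENCY_BOUNDS_MS, elapsed_ms),
               bisect_right(SIZE_BOUNDS, size_bytes))
        table[key][0] += 1
        table[key][1] += size_bytes
    return table
```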
Table 3 summarises some of the results - each cell value is the amount of time divided by the respective number of bytes in the category (the category's QoS). Some of the results could be expected. Others need further analysis.
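The per-category value described here (total time divided by total bytes) can be sketched as below. The unit of time (milliseconds), the category labels and the function names are assumptions for illustration.

```python
def category_qos(total_time_ms, total_bytes):
    """Time spent per byte served in one category (None if no bytes were served)."""
    if total_bytes == 0:
        return None
    return total_time_ms / total_bytes


def qos_by_category(records):
    """records: iterable of (category, elapsed_ms, size_bytes) tuples."""
    totals = {}
    for category, elapsed_ms, size_bytes in records:
        time_sum, byte_sum = totals.get(category, (0, 0))
        totals[category] = (time_sum + elapsed_ms, byte_sum + size_bytes)
    return {category: category_qos(time_sum, byte_sum)
            for category, (time_sum, byte_sum) in totals.items()}
```

Under this definition a lower value means less time spent per byte served.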
The overall performance of proxy-www is characterised by the following values:
| Sizes | 0KB-1KB | | 1KB-5KB | | 5KB-10KB | | 10KB-50KB | | > 50KB | |
| Latency | Req. | Bytes | Req. | Bytes | Req. | Bytes | Req. | Bytes | Req. | Bytes |
| < 200 ms | 2016 | 639177 | 1122 | 2669912 | 329 | 2439012 | 237 | 4154608 | 5 | 296045 |
| < 500 ms | 862 | 274329 | 540 | 1265865 | 142 | 1073128 | 147 | 2823041 | 10 | 606643 |
| < 1 s | 367 | 113234 | 238 | 558605 | 84 | 607271 | 68 | 1324845 | 10 | 954271 |
| < 2 s | 234 | 78255 | 187 | 440066 | 48 | 341868 | 37 | 732803 | 18 | 2042829 |
| < 3 s | 74 | 24641 | 70 | 149392 | 24 | 184631 | 19 | 314632 | 6 | 412528 |
| < 4 s | 98 | 32357 | 74 | 170541 | 25 | 188154 | 14 | 291044 | 2 | 198783 |
| < 5 s | 36 | 15871 | 45 | 94420 | 13 | 87386 | 5 | 80234 | 1 | 89972 |
| < 6 s | 17 | 5669 | 30 | 62817 | 4 | 26972 | 5 | 108459 | 3 | 769313 |
| < 7 s | 23 | 10297 | 26 | 51955 | 3 | 19841 | 5 | 130999 | 1 | 167986 |
| < 8 s | 15 | 7087 | 13 | 31297 | 3 | 20404 | 4 | 82524 | 0 | 0 |
| < 9 s | 8 | 2739 | 17 | 40664 | 2 | 12718 | 4 | 57880 | 1 | 54722 |
| < 10 s | 16 | 3410 | 14 | 32904 | 3 | 23902 | 4 | 91460 | 1 | 192111 |
| < 15 s | 76 | 8671 | 34 | 86856 | 10 | 73180 | 9 | 168701 | 3 | 894599 |
| < 20 s | 28 | 3337 | 16 | 28703 | 2 | 17006 | 6 | 105810 | 0 | 0 |
| < 25 s | 11 | 1044 | 18 | 58739 | 0 | 0 | 3 | 49523 | 3 | 1166631 |
| < 30 s | 7 | 505 | 6 | 10141 | 1 | 7145 | 0 | 0 | 0 | 0 |
| < 35 s | 5 | 432 | 2 | 5275 | 0 | 0 | 0 | 0 | 0 | 0 |
| < 40 s | 1 | 0 | 2 | 8192 | 0 | 0 | 0 | 0 | 0 | 0 |
| < 45 s | 2 | 1596 | 3 | 7889 | 0 | 0 | 0 | 0 | 0 | 0 |
| < 50 s | 0 | 0 | 2 | 5435 | 0 | 0 | 1 | 36864 | 0 | 0 |
| >= 50 s | 5 | 336 | 4 | 8420 | 1 | 6390 | 0 | 0 | 3 | 11010237 |
| Total (bytes in Mbytes) | 3901 | 1,166 | 2463 | 5,520 | 694 | 4,891 | 568 | 10,065 | 67 | 17,983 |
| Class of sizes | | TCP_HIT | TCP_REFRESH_HIT | TCP_IMS_HIT |
| 0KB-1KB | | | | |
| 1KB-5KB | | | | |
| 5KB-10KB | | | | |
| 10KB-50KB | | | | |
| 50KB-100KB | | | | |
| 100KB-500KB | | | | |
| 500KB-1MB | | | | |
| 1MB-5MB | | | | |
| 5MB-10MB | | | | |
| >= 10 MB | | | | |
The objects of analysis are the access logs of two RCCN proxy-caches - those located in Porto: let's call them proxy-1 and proxy-2. For these proxy-caches, we carried out an analysis similar to that of proxy-www.
Proxy-1
In the period of analysis, we registered:
Figure 6 - Ratios (UDP_HIT_OBJ in class / total number of UDP_HIT_OBJ) and (TCP HITs in class / total number of TCP HITs).
Figure 7 - Ratios of UDP_HIT_OBJ and TCP HITs in each class to the
total number of HITs (UDP_HIT_OBJ + TCP HITs)
| Sizes | 0KB-1KB | | 1KB-5KB | | 5KB-10KB | | 10KB-50KB | |
| Latency | Requests | Bytes | Requests | Bytes | Requests | Bytes | Requests | Bytes |
| < 200 ms | 787 | 456736 | 1101 | 2707581 | 419 | 3245858 | 219 | 2753111 |
| < 500 ms | 0 | 0 | 3 | 9300 | 0 | 0 | 0 | 0 |
| Total (bytes in Mbytes) | 787 | 0,436 | 1104 | 2,591 | 419 | 3,095 | 219 | 2,626 |
Table 5 presents all the objects with size greater than 50 Kbytes in a single class. The analysis was done considering all the classes presented in table 1, but most objects are smaller than 50 Kbytes.
| Sizes | 0KB-1KB | | 1KB-5KB | | 5KB-10KB | | 10KB-50KB | | > 50KB | |
| Latency | Req. | Bytes | Req. | Bytes | Req. | Bytes | Req. | Bytes | Req. | Bytes |
| < 200 ms | 38838 | 11640303 | 20248 | 49458736 | 7894 | 58405190 | 6332 | 1,04E+08 | 5 | 280021 |
| < 500 ms | 11637 | 3199219 | 5701 | 14020030 | 2147 | 15815790 | 2564 | 53335401 | 73 | 4365163 |
| < 1 s | 5516 | 1503509 | 3155 | 7800796 | 1139 | 8391094 | 1280 | 26746383 | 152 | 10744962 |
| < 2 s | 3176 | 873056 | 1782 | 4410079 | 621 | 4508009 | 772 | 17032493 | 135 | 12459085 |
| < 3 s | 940 | 221544 | 505 | 1271491 | 161 | 1177468 | 280 | 7256187 | 62 | 7353066 |
| < 4 s | 1324 | 340414 | 808 | 2003640 | 236 | 1705946 | 304 | 6781812 | 42 | 4504515 |
| < 5 s | 937 | 236811 | 490 | 1215035 | 150 | 1089163 | 203 | 5333672 | 46 | 5798444 |
| < 6 s | 337 | 82318 | 170 | 435951 | 67 | 482374 | 104 | 2813837 | 24 | 2314080 |
| < 7 s | 265 | 67014 | 176 | 468644 | 59 | 426424 | 104 | 2674931 | 22 | 4001574 |
| < 8 s | 186 | 52250 | 131 | 336427 | 40 | 286801 | 98 | 2517966 | 14 | 2298462 |
| < 9 s | 181 | 38513 | 79 | 196593 | 25 | 174236 | 62 | 1764400 | 16 | 2402981 |
| < 10 s | 299 | 84315 | 215 | 550845 | 71 | 544394 | 121 | 2866917 | 19 | 1584611 |
| < 15 s | 577 | 150699 | 383 | 1005664 | 151 | 1113924 | 412 | 10712062 | 65 | 10228892 |
| < 20 s | 205 | 57373 | 142 | 363287 | 66 | 488707 | 169 | 4287736 | 55 | 10030421 |
| < 25 s | 171 | 51204 | 123 | 318482 | 43 | 327676 | 145 | 4112146 | 38 | 3529049 |
| < 30 s | 68 | 17621 | 50 | 128607 | 32 | 247491 | 144 | 4139484 | 28 | 6218117 |
| < 35 s | 54 | 13707 | 35 | 93539 | 20 | 149264 | 74 | 2115401 | 21 | 2575284 |
| < 40 s | 31 | 11807 | 20 | 49160 | 15 | 105389 | 56 | 1758705 | 10 | 4573747 |
| < 45 s | 32 | 9448 | 14 | 33132 | 8 | 61554 | 51 | 1484820 | 16 | 5272535 |
| < 50 s | 29 | 8803 | 30 | 74883 | 21 | 166202 | 45 | 1405327 | 10 | 1352410 |
| >= 50 s | 108 | 28760 | 93 | 249152 | 59 | 473194 | 206 | 6841037 | 258 | 1,88E+08 |
| Total (bytes in Mbytes) | 64911 | 17,823 | 34350 | 80,57 | 13025 | 91,687 | 13526 | 257,834 | 1111 | 276,79 |
Table 6 presents, for each type of HIT, the QoS as a function of object size.
| Class of sizes | UDP_HIT_OBJ | TCP_HIT | TCP_REFRESH_HIT | TCP_IMS_HIT |
| 0KB-1KB | | 0,386 | | |
| 1KB-5KB | | 0,131 | | |
| 5KB-10KB | | 0,091 | | |
| 10KB-50KB | | 0,137 | | |
| 50KB-100KB | | 0,543 | | |
| 100KB-500KB | | 0,835 | | |
| 500KB-1MB | | 0,926 | | |
| 1MB-5MB | | 0,974 | | |
| 5MB-10MB | | 0,348 | | |
| >= 10 MB | | 0 | | |
The overall metrics for proxy-1 are:
The cache proxy-2 was not configured to use UDP_HIT_OBJ (default). So, only TCP HITs were obtained. The number of HITs was 117072 (49667 TCP_HIT, 28660 TCP_IMS_HIT and 38745 TCP_REFRESH_HIT), representing a total of 583,06 Mbytes. As in proxy-1, most requested objects are smaller than 1 Kbyte (see figure 8).
Figure 8 - Distribution of TCP HITs by size of requested objects
Table 7 shows the distribution of all the 117072 TCP HITs, considering categories for the response latency and the size of the requested object. As in the previous analysis, the objects greater than 50 Kbytes were grouped in a single class as their number was small.
| Sizes | 0KB-1KB | | 1KB-5KB | | 5KB-10KB | | 10KB-50KB | | > 50KB | |
| Latency | Requests | Bytes | Requests | Bytes | Requests | Bytes | Requests | Bytes | Requests | Bytes |
| < 200 ms | 26979 | 7162463 | 11644 | 28751592 | 4142 | 30793278 | 2422 | 34199568 | 0 | 0 |
| < 500 ms | 9076 | 2288718 | 3988 | 9804954 | 1321 | 9812004 | 1647 | 32459496 | 20 | 1148483 |
| < 1 s | 7224 | 1848870 | 3451 | 8372338 | 1121 | 8268915 | 1212 | 23781258 | 89 | 6197528 |
| < 2 s | 5503 | 1415356 | 2630 | 6467135 | 890 | 6557714 | 1058 | 21418274 | 70 | 5718242 |
| < 3 s | 2680 | 690539 | 1262 | 3152642 | 443 | 3200613 | 562 | 12045690 | 73 | 5881693 |
| < 4 s | 2081 | 548553 | 1083 | 2674967 | 286 | 2142776 | 424 | 9075447 | 43 | 3820084 |
| < 5 s | 1533 | 387771 | 848 | 2126521 | 232 | 1702692 | 306 | 6366384 | 56 | 4727298 |
| < 6 s | 1097 | 261962 | 520 | 1349524 | 169 | 1219818 | 201 | 4375169 | 39 | 3759956 |
| < 7 s | 881 | 218487 | 401 | 1043533 | 133 | 969504 | 169 | 3491086 | 25 | 2348168 |
| < 8 s | 700 | 168534 | 420 | 1080586 | 107 | 802851 | 128 | 2563347 | 25 | 2454140 |
| < 9 s | 550 | 144692 | 324 | 796025 | 90 | 691081 | 108 | 2376791 | 21 | 1763958 |
| < 10 s | 557 | 142927 | 321 | 820781 | 92 | 669732 | 113 | 2344428 | 21 | 2698640 |
| < 15 s | 1870 | 513413 | 995 | 2582698 | 323 | 2356591 | 424 | 8242534 | 54 | 6397551 |
| < 20 s | 945 | 252220 | 503 | 1333927 | 145 | 1105743 | 228 | 4718127 | 40 | 5362932 |
| < 25 s | 619 | 183614 | 419 | 1093831 | 102 | 782877 | 141 | 2969950 | 22 | 3615987 |
| < 30 s | 429 | 108802 | 276 | 718844 | 66 | 484815 | 102 | 2248433 | 18 | 3256096 |
| < 35 s | 331 | 87344 | 230 | 622545 | 60 | 445342 | 66 | 1453848 | 22 | 4467137 |
| < 40 s | 243 | 63749 | 130 | 355850 | 40 | 305756 | 56 | 1326127 | 8 | 2157886 |
| < 45 s | 191 | 47996 | 117 | 340100 | 19 | 148739 | 34 | 844918 | 11 | 4779051 |
| < 50 s | 211 | 47145 | 120 | 342999 | 35 | 255912 | 46 | 1074825 | 11 | 2719729 |
| >= 50 s | 1917 | 487754 | 1097 | 3058104 | 230 | 1692962 | 322 | 7651456 | 193 | 1,85E+08 |
| Total (bytes in Mbytes) | 65617 | 16,280 | 30779 | 73,328 | 10046 | 70,963 | 9769 | 176,456 | 861 | 246,029 |
The QoS by categories for proxy-2 is given in table 8.
| Class of sizes | | | |
| 0KB-1KB | | | 31,218 |
| 1KB-5KB | | | |
| 5KB-10KB | | | |
| 10KB-50KB | | | |
| 50KB-100KB | | | |
| 100KB-500KB | | | |
| 500KB-1MB | | | |
| 1MB-5MB | | | |
| 5MB-10MB | | | |
| >= 10MB | | | |
The overall metrics are:
The proposed measures may give some information about performance, in terms of response times to requests.
However, there are some aspects that need further analysis. Probably, the category of requests below 200 ms should be split in order to have more detailed results; the present approach aggregates a lot of requests in a single category (which means losing information). The presented latency times are absolute values; perhaps relative values, in milliseconds per byte transferred, could be more useful.
Concerning object size, some research suggests that requested objects whose size exceeds the mean by more than some number of standard deviations should not be considered. This may be a better solution. However, it seems difficult to determine the optimum number of standard deviations. This needs further study.
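One possible reading of this filtering rule is sketched below: objects larger than the mean size plus k standard deviations are discarded. The threshold form, the use of the population standard deviation and the parameter `k` are all assumptions, and the sketch says nothing about how k should be chosen.

```python
from statistics import mean, pstdev


def drop_large_objects(sizes, k):
    """Keep only the sizes that do not exceed mean + k standard deviations."""
    limit = mean(sizes) + k * pstdev(sizes)
    return [size for size in sizes if size <= limit]
```

For example, with nine objects of 1 Kbyte and one of 100 Kbytes, k = 2 removes only the large object.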
Another aspect, not negligible, is the time of day at which requests are made. It is known, for instance, that accesses are faster at late hours and slower during working hours. For this reason, the time of day should probably be considered in the analysis.
In spite of all this, for the purpose of tuning one particular cache for better performance, i.e., choosing a better architecture, it is believed that these results can be an important help.
At Universidade do Minho, planned experiments will evaluate the performance of ICP multicast based architectures and results will be compared amongst different architectures (currently, no use is made of multicast and proxy-www has only parents).
Also interesting, but maybe difficult to achieve, would be the characterisation of access patterns. Knowing at least some of the characteristics of the Portuguese community's access patterns to the WWW could be rewarding. This knowledge could be very useful for international or transcontinental accesses - load balancing could be done based on rigorous data, and the use of pre-fetching (it is not widely accepted that it can improve the HIT ratio, and some cache administrators dislike it because it overloads cache servers) could improve the response times experienced by end users. Caches could be specialised by domains.
This work would not have been possible without the collaboration of the RCCN and CIUP people, who kindly provided us with the access log files of their Porto proxies for analysis. Thanks to all, especially Rogério Reis and Carmen at CIUP.