Improved RTT based peer selection - Overview

By John Moore

I have made some improvements to the peer selection algorithm by using FULL RTT. I think it is noticeably better than the original one because it gives a more accurate estimate. These functions have been tested for two weeks, and Squid 2.4 works fine with the new peer selection algorithm.

Something about FULL RTT

FULL RTT
    The RTT from the local cache to the parent cache plus the RTT from the parent cache to the destination.
p_min_rtt
    A pointer to the parent cache peer with the least FULL RTT, if one is available.
p->stats.rtt
    The RTT from the local cache to the parent cache. (A zero RTT is not valid for comparison.)
h->rtt
    The RTT from the parent cache to the destination. (A zero RTT is never saved in the Network DB.)
n->rtt
    The DIRECT ACCESS RTT from the local cache to the destination. (A zero RTT is not valid for comparison.)
full_rtt
    full_rtt = p->stats.rtt + h->rtt

As the figure below shows, we should choose the parent cache peer with the least FULL RTT.

Furthermore, if n->rtt (the direct access RTT) is smaller than the closest parent peer's FULL RTT and direct access is allowed, we should choose direct access rather than the parent peer.

 _____________          n->rtt           _____________
| Local Cache |<----------------------->| Destination |
 -------------                           -------------
    ^   p->stats.rtt  ______________   h->rtt    ^
    |--------------->| Parent Cache |<-----------|
                      --------------
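
The rule above can be sketched in C as follows. This is only an illustration, not the actual patch: the struct parent_rtt and the functions full_rtt, closest_parent and direct_is_closer are hypothetical names, and the struct is a simplified stand-in for Squid's real peer and NetDB structures. A zero RTT is treated as "unknown" and skipped, as described in the definitions above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one parent peer (not Squid's struct). */
typedef struct {
    const char *name;
    int peer_rtt;   /* p->stats.rtt: local cache -> parent, 0 = unknown */
    int host_rtt;   /* h->rtt: parent -> destination, 0 = no NetDB entry */
} parent_rtt;

/* FULL RTT of one parent, or 0 if either leg is unknown. */
int full_rtt(const parent_rtt *p)
{
    if (p->peer_rtt <= 0 || p->host_rtt <= 0)
        return 0;
    return p->peer_rtt + p->host_rtt;
}

/* Return the parent with the least FULL RTT (p_min_rtt), or NULL if
 * no parent has a usable FULL RTT. */
const parent_rtt *closest_parent(const parent_rtt *parents, size_t n)
{
    const parent_rtt *p_min_rtt = NULL;
    size_t i;

    assert(parents != NULL || n == 0);
    for (i = 0; i < n; i++) {
        int rtt = full_rtt(&parents[i]);
        if (rtt == 0)
            continue;               /* zero RTT is not valid for comparing */
        if (p_min_rtt == NULL || rtt < full_rtt(p_min_rtt))
            p_min_rtt = &parents[i];
    }
    return p_min_rtt;
}

/* Nonzero if direct access should be chosen over the closest parent.
 * Returns 0 when the comparison cannot be made (assumption of this sketch). */
int direct_is_closer(int direct_rtt, const parent_rtt *p_min_rtt,
                     int direct_allowed)
{
    if (!direct_allowed || direct_rtt <= 0 || p_min_rtt == NULL)
        return 0;
    return direct_rtt < full_rtt(p_min_rtt);
}
```

For example, with parents having (p->stats.rtt, h->rtt) of (10, 50), (20, 0) and (5, 30), the second is skipped because its h->rtt is unknown, and the third wins with a FULL RTT of 35.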

Basic Algorithm:

  1. On a HIT (Cache Digest, ICP), compare n->rtt with p->stats.rtt and pick the one with the least RTT.
  2. On a MISS (Cache Digest, ICP), or when using NetDB, compare n->rtt with full_rtt and pick the one with the least RTT.
  3. If we cannot tell which is better, Parent/Sibling or DIRECT ACCESS, let the cache administrator use 'prefer_direct' to choose one.
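
The three steps can be written as one small decision function. This is a minimal sketch under assumptions, not the code from the patch: rtt_choice and choose_by_rtt are hypothetical names, the RTT values are passed in as plain integers, and an exact tie is treated as "cannot know", which is my own reading of step 3.

```c
#include <assert.h>

typedef enum { CHOOSE_DIRECT, CHOOSE_PEER } rtt_choice;

/*
 * direct_rtt    : n->rtt        (0 = unknown)
 * peer_rtt      : p->stats.rtt  (0 = unknown)
 * host_rtt      : h->rtt        (0 = unknown)
 * peer_hit      : nonzero if Cache Digest or ICP reported a HIT
 * prefer_direct : the administrator's 'prefer_direct' setting
 */
rtt_choice choose_by_rtt(int direct_rtt, int peer_rtt, int host_rtt,
                         int peer_hit, int prefer_direct)
{
    int via_peer;

    assert(direct_rtt >= 0 && peer_rtt >= 0 && host_rtt >= 0);

    if (peer_hit) {
        /* Step 1: the peer already has the object, only the first leg counts. */
        via_peer = peer_rtt;
    } else {
        /* Step 2: the peer must fetch it, so use full_rtt (both legs needed). */
        via_peer = (peer_rtt > 0 && host_rtt > 0) ? peer_rtt + host_rtt : 0;
    }

    /* Step 3: a zero RTT is not valid for comparing; a tie is also treated
     * as undecidable here (assumption), so 'prefer_direct' decides. */
    if (direct_rtt == 0 || via_peer == 0 || direct_rtt == via_peer)
        return prefer_direct ? CHOOSE_DIRECT : CHOOSE_PEER;

    return direct_rtt < via_peer ? CHOOSE_DIRECT : CHOOSE_PEER;
}
```

With n->rtt = 100, p->stats.rtt = 20 and h->rtt = 500, a HIT goes to the peer (100 against 20), while a MISS goes direct (100 against 520).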

Other Notes

I've added a new directive named 'minimum_direct_rtt_diff', so that Config.minDirectRttDiff can be used for better peer selection.

I've encountered a problem with the Network Database when the source server is not reachable. For instance, take a NetDB record like this:

194.135.30.0        1/   1  2023.0  26.0 immigration.kulichki.net www.kulichki.com
Suppose we now access www.kulichki.com while this network is unreachable for some reason. The record then looks like this:
194.135.30.0        1/   2  2023.0  26.0 immigration.kulichki.net
The RTT is still 2023, although 'recv' = 1 / 'sent' = 2 means that an ICMP packet was lost. The peer selection algorithm therefore still considers '194.135.30.0' reachable and makes a mistake. For now I have to stop Squid, delete 'netdb_state' in the cache directory, and start Squid again to build a fresh NetDB.

I think Squid should set the RTT to zero when an ICMP packet is lost. NetDB should represent the latest network conditions for peer selection: the more accurate the data NetDB keeps, the better the peer selection algorithm works.
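
A rough sketch of this proposal, assuming a much simplified record: netdb_entry, netdb_ping_lost, netdb_ping_answered and netdb_rtt_usable are hypothetical names, not Squid functions, and the RTT is simply overwritten on each reply, which may not be how Squid itself updates it.

```c
#include <assert.h>

/* Hypothetical, simplified NetDB record (not Squid's real structure). */
typedef struct {
    int pings_sent;   /* the 'sent' counter */
    int pings_recv;   /* the 'recv' counter */
    double rtt;       /* 0.0 = no usable measurement */
} netdb_entry;

/* An ICMP echo went unanswered: count it and drop the stale RTT. */
void netdb_ping_lost(netdb_entry *n)
{
    assert(n != 0);
    n->pings_sent++;
    n->rtt = 0.0;     /* proposed change: do not keep the old value */
}

/* An ICMP echo was answered: count it and store the measured RTT. */
void netdb_ping_answered(netdb_entry *n, double rtt)
{
    assert(n != 0);
    n->pings_sent++;
    n->pings_recv++;
    n->rtt = rtt;
}

/* Peer selection should ignore entries whose RTT is zero. */
int netdb_rtt_usable(const netdb_entry *n)
{
    return n->rtt > 0.0;
}
```

Applied to the record above (recv = 1, sent = 1, RTT = 2023.0), a lost packet would leave recv = 1, sent = 2 and RTT = 0, so the entry would no longer be used for comparison until a new reply arrives.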

Another problem is that Squid does not save the current NetDB data to the file 'netdb_state' before shutting down. The current NetDB data may be useful at the next startup.

BTW, there is no variable named 'hops' in struct _peer.stats. How can I add it and make it work? I want to use peer.stats.hops to calculate FULL HOPS. In the function 'netdbClosestParent':

FULL_HOPS= peer.stats.hops + h->hops

FULL_HOPS can help netdbClosestParent choose the peer with the least FULL RTT and the least FULL HOPS.


$Id: overview.html,v 1.2 2002/09/01 14:50:20 hno Exp $