Re: [RFC] byte hit ratio

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 08 Feb 2012 18:54:25 -0700

On 02/07/2012 03:52 PM, Amos Jeffries wrote:
> On 08.02.2012 07:48, Alex Rousskov wrote:
>> On 02/07/2012 05:00 AM, Amos Jeffries wrote:
>>> On 7/02/2012 9:40 p.m., Henrik Nordström wrote:
>>>> tis 2012-02-07 klockan 14:01 +1300 skrev Amos Jeffries:
>>>>> We have a long history of questions and bugs mentioning negative
>>>>> numbers in the byte hit ratio.
>>>>>
>>>>> I've always thought it was a bug we had not tracked down, but the FAQ
>>>>> says it is correct.
>>>>>
>>>>> http://wiki.squid-cache.org/SquidFaq/InnerWorkings#Why_do_I_see_negative_byte_hit_ratio.3F
>>>>>
>>>>>
>>>> Yes.. it's based on the difference between traffic squid<-servers and
>>>> clients<-squid. This can be negative (more traffic squid<-servers than
>>>> clients<-squid) in some situations.
>>>>
>>>> - retried requests
>>>> - range retrieval being processed by Squid
>>>> - continued download after client disconnects (quick_abort_...)
>>>
>>> Wiki also mentions cache digests but ...
>>> " /*
>>> * This ugly hack is here to prevent the user from seeing a
>>> * negative byte hit ratio. When we fetch a cache digest from
>>> * a neighbor, it gets treated like a cache miss because the
>>> * object is consumed internally. Thus, we subtract cache
>>> * digest bytes out before calculating the byte hit ratio.
>>> */
>>> cd = CountHist[0].cd.kbytes_recv.kb -
>>> CountHist[minutes].cd.kbytes_recv.kb;
>>> "
>>
>> I think that hack should be removed (why would we want to lie about
>> bandwidth usage?) but I may be missing some deeper reasons why it was
>> added.
>>
>>>
>>> Which one is inaccurate?
>>> "Hits as % of traffic sent" with calculation of (net traffic / client
>>> bytes)
>>> or
>>> "Net traffic gain/loss" with calculation of (net traffic /
>>> client_bytes)
>>> or
>>> "Hits as % of client traffic" with calculation of ( sum_hits /
>>> client_bytes )
>>>
>>> One guess which one we have today ...
>>
>> I think the bandwidth G/L formula should be something like:
>>
>> (client - server) / client
>>
>> Note how it is independent from the definition of what a "hit" is.
>
> This is my proposal's point exactly.
>
>
> PS. in the above, "net traffic" == (client - server)

>> The name is a separate question.
>>
>> We could use a similar formula to report "hit ratio" as well (just use
>> message counts instead of bytes) but that would be somewhat against the
>> "standard practice". I am not against adding a separate "message G/L"
>> line for that.
>
> Okay. So if I get this you are in favour of only a text change.

Depends on what the current formula is. If it is (client-server)/client,
I would leave the formula as is and, if you wish, change the label. If
it is something else, then we can start by adding a new line with a
better label/formula and then possibly, eventually remove the old line
with a worse label/formula.
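
For illustration only, here is a minimal sketch of the (client - server) / client calculation discussed above. It is not taken from the Squid sources: the function name, the use of double, and the choice to report 0 when no client traffic has been seen are all assumptions made for this example.

```c
/*
 * Hypothetical helper, not Squid's actual code.
 *
 * client_bytes: bytes sent from Squid to clients in the interval
 * server_bytes: bytes received by Squid from servers in the interval
 *
 * Returns the bandwidth gain/loss as a fraction of client traffic.
 * The result is negative when Squid fetched more than it delivered
 * (retries, range handling, downloads continued after a client abort).
 */
double byte_gain_ratio(double client_bytes, double server_bytes)
{
    /* Assumption: with no client traffic there is nothing to report. */
    if (client_bytes <= 0.0)
        return 0.0;

    return (client_bytes - server_bytes) / client_bytes;
}
```

With 1000 bytes sent to clients and 1500 bytes fetched from servers, this returns -0.5, i.e. a net loss. Nothing in the calculation depends on how a "hit" is defined.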

Hope this clarifies,

Alex.
Received on Thu Feb 09 2012 - 01:54:47 MST
