Re: your suggestion for range_offset_limit

From: Adrian Chadd <adrian_at_squid-cache.org>
Date: Thu, 26 Nov 2009 11:56:55 -0500

The trick, at least in squid-2, is to make sure that quick abort isn't
occurring. Otherwise it will begin downloading the whole object, return the
requested range bit, and then abort the remainder of the fetch.

Adrian

2009/11/25 Amos Jeffries <squid3_at_treenet.co.nz>:
> Matthew Morgan wrote:
>>
>> On Wed, Nov 25, 2009 at 7:09 PM, Amos Jeffries <squid3_at_treenet.co.nz>
>> wrote:
>>>
>>> Matthew Morgan wrote:
>>>>
>>>> Sorry it's taking me so long to get this done, but I do have a question.
>>>>
>>>> You suggested making getRangeOffsetLimit a member of HttpReply.  There
>>>> are
>>>> two places where this method currently needs to be called: one is
>>>> CheckQuickAbort2() in store_client.cc.  This one will be easy, as I can
>>>> just
>>>> do entry->getReply()->getRangeOffsetLimit().
>>>>
>>>> The other is HttpStateData::decideIfWeDoRanges in http.cc.  Here, all we
>>>> have access to is an HttpRequest object.  I looked through the source to
>>>> see
>>>> if I could find where a request owned or had access to a reply, but I
>>>> don't
>>>> see anything like that.  If getRangeOffsetLimit were a member of
>>>> HttpReply,
>>>> what do you suggest doing here?  I could make a static version of the
>>>> method, but that wouldn't allow caching the result.
>>>
>>> Ah. I see. Quite right.
>>>
>>> After a bit more thought I find my original request a bit weird.
>>>
>>> Yes it should be a _Request_ member and do its caching there. You can go
>>> ahead with that now while we discuss whether to do a slight tweak on top
>>> of
>>> the basic feature.
>>>
>>>
>>> [cc'ing squid-dev so others can provide input]
>>>
>>> I'm not certain of the behavior we want here if we do open the ACLs to
>>> reply
>>> details. Some discussion is in order.
>>>
>>> Simple way would be to not cache the lookup the first time when reply
>>> details are not provided.
>>>
>>> It would mean making it return potentially two different values across
>>> the
>>> transaction.
>>>
>>>  1) based on only request details, to decide if a range request is
>>> possible,
>>> and then
>>>  2) based on request+reply details, to see if the abort could be done.
>>>
>>> No problem if the reply details cause an increase in the limit. But if
>>> they
>>> restrict it we enter grounds of potentially making a request then
>>> canceling
>>> it and being unable to store the results.
>>>
>>>
>>> Or, taking the maximum of the two across two calls, so it can only
>>> increase? That would be slightly trickier, involving a flag as well to
>>> short-circuit the reply lookups instead of just a magic cache value.
>>>
>>> Am I seriously over-thinking things today?
>>>
>>>
>>> Amos
>>
>> Here's a question, too: is this feature going to benefit anyone?  I
>> realized later that it will not solve my problem, because all the
>> traffic that was getting force downloaded ended up being from windows
>> updates.  The urls showing up in netstat and such were just weird
>> because the windows update traffic was actually coming from limelight.
>>  My ultimate solution was to write a script that reads access.log,
>> checks for windows update urls that are not cached, and manually
>> downloads them one at a time after hours.
>>
>> If there is anyone at all who would benefit from this I would still be
>> *more* than glad to code it (as I said, it would be my first real open
>> source contribution...very exciting), but I just wondered if anyone
>> will actually use it.
>
> I believe people will find more control here useful.
>
> Windows update service packs are a big reason, but there are also similar
> range issues with Adobe Reader online PDFs, google maps/earth, and flash
> videos when paused/resumed. Potentially other stuff, but I have not heard of
> problems.
>
> This will allow anyone to fine tune the places where ranges are permitted or
> forced to fully cache, avoiding the problems a blanket limit adds.
>
>>
>> As to which approach would be better, I don't know enough about that
>> data path to really suggest.  When I initially made my changes, I just
>> replaced each reference to Config.range_offset_limit or whatever.
>> Today I went back and read some more of the code, but I'm still
>> figuring it out.  How often would the limit change based on the
>> request vs. the reply?
>
> Just the once, the first time it is checked for the reply, and most
> likely in the case of testing for a reply mime type. The other useful
> info I can think of is all request data.
>
> You can ignore if you like. I'm just worrying over a borderline case.
> Someone else can code a fix if they find it a problem or need to do mime
> checks.
>
> Amos
> --
> Please be using
>  Current Stable Squid 2.7.STABLE7 or 3.0.STABLE20
>  Current Beta Squid 3.1.0.15
>
>
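
The "maximum of the two across two calls" idea from the quoted message can
be sketched roughly as below. This is only an illustration under stated
assumptions, not Squid code: RangeLimitCache, ReplyInfo, widerLimit and the
evaluator callback are all hypothetical names standing in for whatever the
request object and ACL lookup would really look like. The one thing borrowed
from Squid is the convention that -1 means "no limit".

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical stand-in for the reply details an ACL might inspect.
struct ReplyInfo {
    const char *mimeType;
};

// -1 means "no limit", following the range_offset_limit convention.
static const int64_t kUnlimited = -1;

// Return the more permissive of two limits; kUnlimited beats any number.
inline int64_t widerLimit(int64_t a, int64_t b) {
    if (a == kUnlimited || b == kUnlimited)
        return kUnlimited;
    return a > b ? a : b;
}

// Hypothetical per-request cache. The evaluator stands in for the ACL
// lookup: it is called with nullptr when only request details are known,
// and with a ReplyInfo once the reply has arrived.
class RangeLimitCache {
public:
    using Evaluator = std::function<int64_t(const ReplyInfo *)>;

    explicit RangeLimitCache(Evaluator e) : evaluate_(std::move(e)) {}

    int64_t get(const ReplyInfo *reply = nullptr) {
        if (!requestChecked_) {
            // First call: request-only lookup, remembered from here on.
            limit_ = evaluate_(nullptr);
            requestChecked_ = true;
        }
        if (reply && !replyChecked_) {
            // First call with a reply: the limit may only grow.
            limit_ = widerLimit(limit_, evaluate_(reply));
            // The flag short-circuits any later reply lookups.
            replyChecked_ = true;
        }
        return limit_;
    }

private:
    Evaluator evaluate_;
    int64_t limit_ = 0;
    bool requestChecked_ = false;
    bool replyChecked_ = false;
};
```

With this shape, a decideIfWeDoRanges-style caller would use get() with no
argument and a CheckQuickAbort2-style caller would pass the reply. Because
the value can only increase, the second answer never contradicts a range
decision already made from the first.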
Received on Thu Nov 26 2009 - 16:57:03 MST
