Re: [PATCH] variant Key negotiation on error pages

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Thu, 21 Mar 2013 21:38:37 +1300

On 21/03/2013 5:45 p.m., Alex Rousskov wrote:
> On 03/20/2013 09:02 PM, Amos Jeffries wrote:
>
>> + rep->header.putStr(HDR_VARY,tmp.termedBuf());
>> +
>> + // Key:Accept-Language;b="foo"
>> + // Only supply the match parameters if the language is actually found.
>> + // On the generic reply to all 'unknown' inputs we must be vague like Vary
>> + // which makes the recipient use the entire Accept-Language as the variant key.
>> + if (err_language) {
>> + tmp.append(";b=\"");
>> + tmp.append(err_language);
>> + tmp.append('"');
>> + }
>> + rep->header.delById(HDR_KEY);
>> + rep->header.putStr(HDR_KEY, tmp.termedBuf());
>
> If I am interpreting draft-fielding-http-key-02 correctly, there is no
> point in sending
>
> Vary: Accept-Language
> Key: Accept-Language
>
> because both of the above header fields have identical semantics
> (Section 2). Did I misunderstood? If I did not, you can move the
> delById() and putStr() calls inside the err_language if guard (and merge
> the two identical if-guards, one after another?).

The draft says:

  "When a cache fully implements this mechanism, it MAY ignore the Vary
   response header field."

That does not limit the ignoring of Vary to just Vary+Key responses. So
until (and unless) the spec gets updated to mandate following Vary
whenever Key is absent, it's best to always emit Key+Vary, for the same
reason that it is best to always emit Vary if *any* of the URL's
responses needs it. Otherwise we risk causing for Key the same problem
that Apache caused for Vary: one response coming back with *no* Key
gravitating all future traffic to that variant regardless of the others'
existence.
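
To make the intended behaviour concrete, here is a minimal standalone
sketch of how the Key value is derived from the Vary value. It is not the
Squid code: buildKeyValue() and its parameters are hypothetical names, and
std::string stands in for the String type used in the patch. The extra
empty-string check is my assumption; the patch only tests the pointer.

```cpp
#include <string>

// Hypothetical helper, not part of Squid: builds the Key field value from
// the Vary field value and the negotiated error-page language.
// errLanguage may be nullptr when no specific language was matched.
std::string buildKeyValue(const std::string &varyValue, const char *errLanguage)
{
    // Key always starts out identical to Vary, so it is emitted on every
    // response, including the generic one for 'unknown' inputs.
    std::string key = varyValue;

    // Only add the b="..." match parameter when a language was found.
    // (The empty-string check is an assumption added for this sketch.)
    if (errLanguage && *errLanguage) {
        key += ";b=\"";
        key += errLanguage;
        key += '"';
    }
    return key;
}
```

With a negotiated language of "en" this yields Accept-Language;b="en",
and with no language found it yields plain Accept-Language, so Key is
present either way.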

>
>>> If the negotiated language was "xfo" and the later requested language
>>> happens to be "xfoobar", will the "xfo" language response be served to
>>> an "xfoobar" reader because of the "prefix match" logic of the b=".."
>>> modifier? Is that a good thing?
>> Yes and yes. Due to the way the codes are syntaxed the valid ones are
>> all 2 or 5 bytes long with an optional trailer.
> Well, many languages have three letters in the primary tag (e.g., duh
> for Dungra Bhil) and other length are allowed, but what is important
> here is whether a language tag may be a prefix of another, completely
> unrelated language tag. If yes, then using the prefix b="..." modifier
> would be wrong IMHO.
>
> For example, should a cached English ("en") error page be served to an
> Emumu-speaking client?

Hmm. Yes, I see they have extended the tags beyond the ISO 639-1 set now.

The way Squid scans to find and fill err_language, the longer codes will
be added in preference to shorter ones. We only run into problems when a
full 2-byte language code is also a prefix of a 3-byte one.

If we send ;b= the 2-byte code will match against 3-byte codes. But if we
send ;p= then the 2-byte code will fail to match against the more common
5-byte wildcards. For example env vs en-*.
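
The trade-off can be sketched roughly as below. This is a simplification
under stated assumptions, not the draft's matching algorithm:
beginsWith() approximates the beginning-substring behaviour of b="...",
and exactMatch() is only a stand-in for "something stricter", not the
definition of the p modifier. Both function names are hypothetical.

```cpp
#include <string>

// Simplified stand-in for the b="..." modifier: true when the requested
// language tag begins with the cached variant's tag.
bool beginsWith(const std::string &requested, const std::string &variant)
{
    return requested.compare(0, variant.size(), variant) == 0;
}

// Stricter comparison for contrast only: whole-tag equality.
// This is NOT how the draft defines any particular modifier.
bool exactMatch(const std::string &requested, const std::string &variant)
{
    return requested == variant;
}
```

With a cached "en" variant, beginsWith() accepts both "en-US" (wanted)
and "enr" (not wanted), while exactMatch() rejects both. That is the
choice being weighed here: the looser match over-serves the rare 3-byte
case, the stricter one under-serves the common 5-byte case.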

>
>> Type: language
>> Subtag: enr
>> Description: Emumu
>> Description: Emem
>> Added: 2009-07-29
>> Type: language
>> Subtag: en
>> Description: English
>> Added: 2005-10-16
>> Suppress-Script: Latn
>
> I could not find any rules that prohibit one language tag to be a prefix
> of another. It is possible that I missed them (not my area of
> expertise), but the above counter-example with Emumu/English seems to
> indicate that there is no such prohibition?

In practice, in my archive of Accept-Language headers collected from
around the net over the last few years, there are no 3-byte codes in use
(so far).

So the question is: do we want to cater for the rare case at the cost of
breaking a more common one? I think not.

Amos
Received on Thu Mar 21 2013 - 08:38:45 MDT
