Re: [PATCH] Unknown cfg function

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 31 Jul 2013 10:59:00 -0600

Christos, Amos,

    Thank you for working on this! I hope we can avoid repeating the
same set of mistakes thrice by carefully considering the big picture and
upgrade path. Here is my understanding of how we want things to work,
based on the review of the reported bugs and this thread so far:

1. configuration_includes_quoted_values defaults to off. The setting
scope extends from the place where the directive is used to either the
end of configuration or to the next use, whichever comes first.
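
To make the scoping rule in item 1 concrete, here is a minimal Python
sketch (not Squid code). The function name strict_flags and the
simplified "directive on|off" line format are assumptions made for
illustration only; real Squid parsing is more involved.

```python
DIRECTIVE = "configuration_includes_quoted_values"

def strict_flags(lines):
    """Yield (line, strict) pairs for every non-directive line.

    `strict` reflects the most recent use of the directive above the
    line, and is False (off) until the directive is first used.
    """
    strict = False  # item 1: the setting defaults to off
    for line in lines:
        words = line.split()
        if len(words) == 2 and words[0] == DIRECTIVE:
            # The new setting applies from here until the next use
            # of the directive or the end of the configuration.
            strict = (words[1] == "on")
            continue
        yield line, strict
```

With this sketch, a line placed above the first use of the directive is
parsed in legacy mode even if the directive is turned on later in the
same file.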

2. When configuration_includes_quoted_values is on, new "strict syntax"
rules are enforced:

2a. "quoted values" and function()s are supported. %Macros are supported
inside quoted values and only inside them. Unknown functions and macros
terminate Squid. Code uses NextToken() to get tokens by default. A small
subset of hand-picked directives may use NextQuotedOrToEol() or other
special methods instead of NextToken() to accommodate more legacy cases.
Logformat is one such example.

2b. A % sign can be \-escaped to block macro expansion in quoted strings
where needed. A few "standard" escape sequences are supported such as
\n. Unknown escape sequences terminate Squid.
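
The quoted-value rules in 2a and 2b could be sketched as follows. This
is an illustrative Python sketch, not Squid code: ConfigError stands in
for "terminate Squid", the escape table beyond \n and \% is an
assumption, and macro names are assumed to be runs of letters, digits,
and underscores, which is simpler than real Squid macro syntax.

```python
class ConfigError(Exception):
    """Stands in for 'terminate Squid' in this sketch."""

# Assumed escape table: the text above names only \n and an escaped %.
ESCAPES = {"n": "\n", "\\": "\\", '"': '"', "%": "%"}

def expand_quoted(body, macros):
    """Expand escapes and %macros in the body of a quoted value.

    `body` is the text between the quotes; `macros` maps hypothetical
    macro names to their replacement strings.
    """
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # 2b: unknown (or dangling) escape sequences are fatal.
            if i + 1 >= len(body) or body[i + 1] not in ESCAPES:
                raise ConfigError("unknown escape sequence")
            out.append(ESCAPES[body[i + 1]])
            i += 2
        elif ch == "%":
            # 2a: collect the macro name; unknown macros are fatal.
            j = i + 1
            while j < len(body) and (body[j].isalnum() or body[j] == "_"):
                j += 1
            name = body[i + 1:j]
            if name not in macros:
                raise ConfigError("unknown macro: %" + name)
            out.append(macros[name])
            i = j
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Note that an escaped % is consumed by the escape branch, so it never
reaches the macro branch. That is what blocks macro expansion.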

2c. By default, the token delimiter is whitespace. Bare (i.e., unquoted)
tokens containing any character other than alphanumeric, underscore,
period, '(', ')', '=', and '-' terminate Squid. For example, foo{bar},
foo@bar, and foo"bar" terminate Squid in strict syntax mode.
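
The bare-token rule in 2c amounts to a character whitelist. A minimal
Python sketch, assuming "alphanumeric" means ASCII letters and digits
and using ValueError in place of terminating Squid (check_bare_token is
a hypothetical name):

```python
import re

# Characters allowed in a bare token per 2c: alphanumerics, underscore,
# period, '(', ')', '=', and '-'. ASCII-only is an assumption.
BARE_TOKEN = re.compile(r"[A-Za-z0-9_.()=-]+")

def check_bare_token(token):
    """Return the token if it is a valid bare token, else raise."""
    if not BARE_TOKEN.fullmatch(token):
        raise ValueError("invalid bare token in strict mode: " + token)
    return token
```

Under this sketch, tokens such as cache_dir, name=value, and fn(arg)
pass, while foo{bar} and foo"bar" are rejected.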

3. When configuration_includes_quoted_values is off, old "legacy syntax"
is supported, to the extent possible:

3a. "quoted values", functions(), and %macros get no special treatment:
they are parsed as in Squid v3.3. Code uses NextToken() to get
tokens by default. A small subset of hand-picked directives may use
NextQuotedOrToEol() instead of NextToken() but that method does _not_
treat quoted strings specially in legacy mode. Logformat is one such
example.

3b. I am not sure about "file.cfg" include syntax in legacy mode. I
think it would be OK _not_ to support it (because fixing configurations
to use new parameters() syntax should not be very difficult in most
cases), but I may be wrong. If it is easy to continue to support
"file.cfg" include style in NextToken() working in legacy mode, then we
should do it.

4. Exceptions: A small set of directives that already supported quoted
strings correctly before the introduction of
configuration_includes_quoted_values must continue to support them.
These directives should call the NextQuotedOrLegacy() method, which
temporarily forces configuration_includes_quoted_values to ON if and
only if the next token starts with a quote.
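
The "temporarily force strict mode" behavior could look roughly like
the Python sketch below. It is not Squid code: the Tokenizer class and
its method names are hypothetical stand-ins for NextToken() and
NextQuotedOrLegacy(), and escape and macro handling inside quoted
values is left out to keep the sketch short.

```python
class Tokenizer:
    """Toy tokenizer with a strict/legacy switch (hypothetical)."""

    def __init__(self, text, strict=False):
        self.rest = text
        self.strict = strict

    def next_token(self):
        self.rest = self.rest.lstrip()
        if not self.rest:
            return None
        if self.strict and self.rest[0] == '"':
            # Strict mode: a leading quote starts a quoted value.
            end = self.rest.find('"', 1)
            if end < 0:
                raise ValueError("unterminated quoted value")
            token = self.rest[1:end]
            self.rest = self.rest[end + 1:]
            return token
        # Legacy mode (or a bare token): split on whitespace only, so
        # quote characters are ordinary token characters.
        parts = self.rest.split(None, 1)
        self.rest = parts[1] if len(parts) > 1 else ""
        return parts[0]

    def next_quoted_or_legacy(self):
        saved = self.strict
        if self.rest.lstrip().startswith('"'):
            self.strict = True  # forced ON for this one token only
        try:
            return self.next_token()
        finally:
            self.strict = saved  # always restore the admin's setting
```

For example, on the input "a b" "c d" in legacy mode, the first call to
next_quoted_or_legacy() returns a b as one token, while a following
plain next_token() call still splits on whitespace and returns "c with
the quote character included.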

This may create a few upgrade problems when, for example, somebody is
using unsupported %macros inside quoted strings with these options, but
the support for quoted strings in those headers was added relatively
recently so there should not be many such cases, and it is not practical
to treat these options as a complex third class. They have to obey all
strict syntax rules when they use quoted strings (and also when
configuration_includes_quoted_values is on, of course).

5. Future changes:

5a. We revisit handling of 'single quoted strings' later, after the
above is done. They are useful to disable macros and standard escape
sequences in strings, but they are not a "must have" for now, and we
need to decide how to treat \-escapes inside them. We will probably
allow escaping of two characters only: \ and '.

5b. We revisit handling of REs, after the above is done. We will
probably add a proper dedicated /quoting mechanism/ for them. For now,
most REs require configuration_includes_quoted_values set to off.

5c. I recommend combining 5a and 5b implementation to support a simple
generic quoting mechanism with an admin-selectable quoting character.
That is almost a must for REs but would also help with general quoting of
complex expressions. See "Quote and Quote-like Operators" in Perl
(perlop man page) for ideas (but we only need a few lines from that
table). Note how 2c facilitates this future change by immediately
prohibiting unquoted tokens that may become quoted strings later. For
example q{foobar} is prohibited if configuration_includes_quoted_values
is on.
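
To illustrate the kind of mechanism 5c has in mind, here is a purely
hypothetical Python sketch of a Perl-like q<delimiter> form. Nothing
here is existing or agreed Squid syntax: the q prefix handling, the
bracket pairs, and the lack of escaping or nesting are all assumptions
made to keep the example small.

```python
# Bracketing delimiters close with their mirror character; any other
# delimiter closes with itself (as in Perl's q// operator).
PAIRS = {"{": "}", "(": ")", "[": "]", "<": ">"}

def parse_q(token):
    """Return the body of a q<delim>body<delim> value (hypothetical)."""
    if len(token) < 3 or token[0] != "q":
        raise ValueError("not a q-quoted value")
    opener = token[1]
    if opener.isalnum() or opener.isspace():
        raise ValueError("invalid quoting character")
    closer = PAIRS.get(opener, opener)
    if token[-1] != closer:
        raise ValueError("unterminated q-quoted value")
    body = token[2:-1]
    if closer in body:
        # This sketch supports neither escaping nor nesting.
        raise ValueError("quoting character inside body")
    return body
```

The admin picks whichever quoting character does not occur in the
value, so an RE containing slashes could be written as q{a/b} and one
containing braces as q/a{2}/.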

Anything I missed or misrepresented?

Thank you,

Alex.
Received on Wed Jul 31 2013 - 16:59:19 MDT
