Re: [MERGE] Initial netfilter mark patch for comment

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sat, 07 Aug 2010 23:39:19 +1200

Andrew Beverley wrote:
> On Mon, 2010-08-02 at 12:03 -0600, Alex Rousskov wrote:
>> On 08/01/2010 05:47 PM, Andrew Beverley wrote:
>>> Please find attached the first version of the netfilter mark patch. I've
>>> not yet tested it extensively, but would welcome some initial feedback
>>> or comments.
>
> Thanks for the prompt reply. Please find attached an updated version of
> the patch, which includes fixes to all the feedback below. It remains
> not-extensively tested, but is attached for further comments.
>
>> * It would be nice to get a proposed commit message describing the
>> changes as a patch preamble. Among other things, this will allow
>> reviewers to understand the overall scope and intent of your work.
>
> Sorry about that, this is new to me. As you've probably gathered, the
> aim of this patch is to implement the existing TOS functionality, but as
> netfilter marking.
>
>> * comm_set_mark() has a very generic name. Many things can be "marked",
>> for many reasons. Can you come up with a better, more specific one?
>
> Renamed to comm_set_nfmark.
>
>> * comm_set_mark() is not documented but is a part of the public Comm API.
>>
>
> Now documented in the source code. Is that the only place to document?
>
>> * getNfctMark() definition #ifdef conditions are inconsistent with its
>> declaration and use #ifdefs.
>
> Fixed.
>
>> * getNfctMark() is static but does not start with a capital letter.
>
> Fixed.
>
>> * getNfctMark() might belong to fde rather than FwdState because it does
>> not use FwdState. I am not sure about this one, but it may indicate a
>> general design problem -- a callback with no connection to the caller.
>
> I thought about moving to fde, but I feel it sits better in FwdState as
> although it is not called directly within it, it is called as a result
> of other code in there.
>
> getNfctMark() has been renamed to GetNfMarkCallback
>
>> * Please document who calls getNfctMark() and when.
>>
>
> Comments added.
>
>> * If getNfctMark() is an async callback that will be called some time
>> after the nfct_callback_register returns, how do you know that clientFde
>> is still a valid pointer _and_ is pointing to the same connection
>> information?
>>
>> * If getNfctMark() is an async callback that will be called at some
>> random time after the nfct_callback_register returns, is it safe to use
>> debugs()?
>
> I'm pretty sure it's synchronous: if I add a 3 second delay in the
> callback, then the rest of the code isn't executed until the callback
> has finished, and the conntrack information is still found.
>
>> * Please use Doxygen /** or /// comments for method descriptions.
>>
>
> Fixed.
>
>> * Please declare local variables when you first use them, if possible.
>> For example,
>
> Fixed.
>
>> * Please add a comment why ct = nfct_new() is not leaking memory despite
>> no matching delete/free OR fix the leak.
>
> nfct_destory() added. Thanks for pointing this out.
>
>> * upstreamMark member documentation should be fixed. What does that
>> member store/mean? I understand that you were copying the documentation
>> bug from upstreamTOS, but I hope we do not have to replicate that.
>
> upstreamMark and upstreamTOS fixed.
>
>> * Please break out new code into a new FwdState method(s) instead of
>> making FwdState::dispatch longer and uglier.
>
> I have added 2 new methods: getUpstreamTOS() and getUpstreamNfMark()
>
>> * Same comment applies to src/client_side_reply.cc code, including the
>> old QOS code already there. It should be extracted from regular methods
>> as they are getting difficult to follow, especially with all the ifdefs.
>
> I have added tosLocalMissValue() and nfmarkLocalMissValue()
>
>> * The new code implements a non-critical feature because errors do not
>> terminate transactions. Yet, most errors are reported at level 1 to
>> cache.log. We often have to modify the code to remove excessive
>> cache.log pollution because it hurts busy proxies. Do you need to use
>> level-1 reporting? Of every error? Or perhaps the code should just
>> increment some stats counters?
>
> I have moved most of the general errors to level 2.
>
>> * Is there a way to defer most support checks to runtime (like
>> comm_set_mark does it), to minimize the use of #ifdefs in the core code?
>> For example, can we use one #ifdef variable for both USE_QOS_NFMARK and
>> USE_ZPH_QOS code, in most contexts? They are intermixed in the code in a
>> complex ways that I find difficult to follow.
>
> I have simplified this and removed as many as possible. I have added
> isTosActive() and isMarkActive() as suggested by Amos, and used these
> instead. Some of the code relies on the libnetfilter_conntrack
> libraries, so I have had to keep that inside #ifdefs.
>
> This leads me on to my first question: why not just remove USE_ZPH_QOS
> and the compilation option --enable-zph-qos? With the attached patch,
> the code only runs when specified in squid.conf, and it has no other
> dependencies. The ZPH kernel patch part can remain in the _SQUID_LINUX
> tests.
>
>> * The USE_QOS_NFMARK and USE_ZPH_QOS naming seems inconsistent.
>>
>
> I have renamed USE_ZPH_QOS to USE_QOS_TOS. However, see my question
> above.
>
>> The above review does not answer your questions below, and I am sorry
>> for that. I hope Amos or others do better. I agree with many Amos'
>> comments, and I especially encourage you to reduce and simplify #ifs and
>> #ifdefs if possible, following Amos' hints.
>
> Hopefully the code is a lot clearer now, especially with the #defs
> removed. Replies to Amos' comments will follow shortly.
>
>>> - The existing TOS patch cannot be disabled at runtime. As such, this
>>> mark patch cannot be either. Would it be preferable to only enable them
>>> both if the qos_flows config option is present? This would also have the
>>> advantage of being able to add CAP_NET_ADMIN as appropriate at runtime.
>
> They are now both enabled at runtime.
>
>>> - I have used sscanf instead of strtoul for both TOS and MARK in
>>> QosConfig.cc (sscanf doesn't auto-detect the format of unsigned long
>>> int). As a result, the tos variable could be changed to type char, which
>>> is what it should be in my opinion. Should I do this?
>
> If we can move to strtoul, I would like to change 'tos' to char
> throughout. Currently it is possible to set it to invalid values in
> squid.conf, which then cause problems with dumpConfigLine.
>
> Question number 2: what is stubQosConfig.cc? Does that also need
> updating for this patch?
>

stub* are cut down set of all non-inline Ip::QosConfig methods and any
globals defined in QosConfig.h. Changes to the API need to be mirrored
there. The functions inside usually call fatal() to alert a wrong
linkage clearly during testing. In this particular case the parse
function will need to be duplicated there.

I have failed to find any problems with strtoul. Looks like it can be
used. Although we may have to find a GPLv2 compatible implementation.

In this version the new methods look they should be in the Ip::Qos
namespace.

  * the new clientReplyContext::tosLocalMissValue() and
clientReplyContext::nfmarkLocalMissValue() methods particularly look
like they really should take the hier code as a second parameter. Both
fd and the hier if passed in should be const.

  * the new FwdState getUpstream* methods are in a similar position but
not quite so clear cut. If you can find way to cleanly move them there
great, otherwise it's not so much a bit deal yet.

* Thank you for the extra documentation on comm_set_tos().

* Please remove the bits touching comm_set_v6only() its 'tos' field is
not related to QoS. Just acronym confusion.

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.6
   Beta testers wanted for 3.2.0.1
Received on Sat Aug 07 2010 - 11:39:32 MDT

This archive was generated by hypermail 2.2.0 : Thu Aug 12 2010 - 12:00:05 MDT