Re: [squid-users] flat file parsing vs db filter rules parsing

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Sun, 31 Oct 2004 15:39:11 +0100 (CET)

On Sat, 30 Oct 2004, Muthukumar wrote:

> We have two ways of processing to get filter rules from the DB:
>
> 1. strtokFile() <-- reads filter rules from DB file. (acl test urlpath_regex -i "/etc/database.db")
> Processed filter rules are stored in system memory in splay tree / linked data structures.
> The stored filter rules are then read from system memory to process client requests.

Which is not much different from simply dumping the DB contents into flat
files and then asking Squid to reread its configuration.

> 2. strtokFile() <--- reads filter rules from FLAT file (/etc/urlsites) ( acl test urlpath_regex -i "/etc/urlsites" )
> Processed filter rules are then moved into a database in a CONTIGUOUS manner (marshalling on BDB).
> The stored filter rules are read from the DB and stored into system memory (unmarshalling on BDB),
> and then every client request is processed.
>
> This requires automatic updating of the FLAT files so that the DB is updated from them.
> Reconfiguring Squid will pick up the new changes.
>
> Which of these designs will give a performance difference?

I think you need to pause and look at what problem it is you want to
solve.

The problem of parsing is inherently different from the problem of
performing a lookup.

The data structure to use for lookups is very much dependent on the type
of acl you look at.
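
To illustrate why the acl type drives the choice of structure, here is a
minimal Python sketch (not Squid code). The rule sets and function names
are hypothetical. Domain-style rules can be answered with a few keyed
lookups, while regex-style rules such as urlpath_regex have no usable
key, so each pattern may have to be tried in turn.

```python
import re

# Hypothetical rule sets, for illustration only.
blocked_domains = {"example.com", "ads.example.net"}
blocked_path_patterns = [
    re.compile(p, re.IGNORECASE) for p in (r"\.exe$", r"/banner/")
]

def domain_matches(host, domains):
    """Check the host and each of its parent domains against a hash set.

    The cost depends on the number of labels in the host name,
    not on the number of rules.
    """
    labels = host.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in domains for i in range(len(labels)))

def path_matches(path, patterns):
    """Regex rules cannot be keyed, so patterns are tried one by one.

    The cost grows with the number of rules.
    """
    return any(p.search(path) for p in patterns)
```

With these assumed rules, domain_matches("www.example.com",
blocked_domains) is true and path_matches("/index.html",
blocked_path_patterns) is false.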

As indicated above, there is very little to gain from treating the DB as
just "storage of a plain file of acl data" which Squid reads entirely into
memory. This is almost the same as dumping the DB content into flat files
and then having Squid read them as ordinary flat files (i.e. no changes
needed to Squid, only in your system management).
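
The flat-file approach can be handled entirely outside Squid. The sketch
below is one possible way to do the dump in Python, assuming the rules
live in an SQLite table; the table name filter_rules, the column name
pattern, and the function name dump_rules are all hypothetical.

```python
import os
import sqlite3
import tempfile

def dump_rules(conn, out_path):
    """Write one rule per line to out_path, replacing any old file atomically.

    Assumes a hypothetical table filter_rules(pattern TEXT).
    """
    rows = conn.execute("SELECT pattern FROM filter_rules ORDER BY pattern")
    out_dir = os.path.dirname(os.path.abspath(out_path))
    # Write to a temporary file in the same directory so that the final
    # rename never exposes a half-written rule file.
    fd, tmp_path = tempfile.mkstemp(dir=out_dir)
    with os.fdopen(fd, "w") as tmp:
        for (pattern,) in rows:
            tmp.write(pattern + "\n")
    os.replace(tmp_path, out_path)
```

After the dump, "squid -k reconfigure" makes Squid reread the file named
in the acl line.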

The true benefits arise with DB-driven lookups, where the lookup itself
uses the indexing capabilities of the DB in question. There are two main
benefits of such a design:

   a) Parsing time is reduced and moved out of Squid. All parsing is done
when updating the DB.

   b) Dynamic updates. Updates to the DB content are immediately reflected
in Squid, eliminating the need to reconfigure Squid to pick up the new acl
data.
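
A DB-driven lookup along these lines might look like the following Python
sketch. It uses SQLite only as a stand-in for whatever DB is chosen; the
table blocked_domains and the functions is_blocked and add_rule are
hypothetical. Each request is answered from the DB index, so a rule added
to the DB takes effect on the next lookup without any reload.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The PRIMARY KEY gives the table an index that lookups can use.
conn.execute("CREATE TABLE blocked_domains (domain TEXT PRIMARY KEY)")

def is_blocked(conn, host):
    """Look up the host and its parent domains directly in the DB index."""
    labels = host.lower().rstrip(".").split(".")
    candidates = [".".join(labels[i:]) for i in range(len(labels))]
    placeholders = ",".join("?" * len(candidates))
    row = conn.execute(
        f"SELECT 1 FROM blocked_domains WHERE domain IN ({placeholders}) LIMIT 1",
        candidates,
    ).fetchone()
    return row is not None

def add_rule(conn, domain):
    """Parsing and normalising happen here, at update time."""
    conn.execute(
        "INSERT OR IGNORE INTO blocked_domains VALUES (?)", (domain.lower(),)
    )
    conn.commit()
```

Calling add_rule(conn, "example.com") makes the next
is_blocked(conn, "www.example.com") return true, with no parsing step in
the lookup path.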

The drawback is slightly reduced performance in some cases. It is much
easier to optimize lookups in a memory structure than in a DB.

If you want a reasonably well-defined goal then I would look into how the
SquidGuard ACLs could be implemented within Squid. These ACLs are pretty
well defined, including a DB structure design suitable for their lookup
needs.

Regards
Henrik
Received on Sun Oct 31 2004 - 07:39:17 MST
