Re: [squid-users] New user - few questions

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Thu, 24 Nov 2011 14:20:57 +1300

 On Wed, 23 Nov 2011 16:38:06 +0000, Sw_at_g wrote:
> Hi Again,
>
> I had a further look at it, and understand a bit more now. What I
> would like to achieve is to have the access log looking as below, and
> only registering individual web page accessed rather than every
> object
> (within the page requested by the user)

 (sorry. Rant warning.)

 There is no "page". HTTP simply has no such concept. This is
 where it differs from the FTP and Gopher protocols of the 1980s, which
 have easily identifiable URLs that can be considered "pages". In HTTP
 everything is an "object", and any type of object can appear at any URL,
 at any time, as negotiated between the client and server.
  For dynamic websites things get extremely tricky. The HTML part most
 developers think of as the "page" may be a collection of snippets
 spread over many requests by the time Squid sees it. Sometimes a script
 or media object creates the HTML after arrival and essentially *is* the
 page (think of the Shockwave Flash "web pages" that were a popular fad
 a few years back).

 A "web page" is a data model which only exists inside web browsers and
 users' heads these days, although browser people call it "the DOM"
 instead of "page" (reflecting what it actually is: a model made up of
 many data objects).

 /rant

 The best you can hope for in HTTP is to log the "Referer:" request
 header, and hope that all clients will actually send it (the
 privacy-obsessed people won't, nor will a lot of automated agents). Even
 then you will face a little skew, since Referer: contains the *last*
 page that was requested, not the current one. It works only because most
 pages have sub-objects whose requests may carry a Referer: during the
 actual page load+display time.
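
 If you do want to try that, something along these lines in squid.conf
 should do it (the log file path is just an example):

   logformat referrers %tl %ru %{Referer}>h
   access_log /var/log/squid/referer.log referrers

 %{Referer}>h logs the Referer: request header verbatim; a "-" appears
 when the client did not send one.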

 The logformat documentation describes the tokens available. The
 access_log documentation describes how to use logformat, with some
 examples at the bottom of the page, and also explains how to use ACL
 tests to log only specific things to one particular file.
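
 For instance, the format you asked about below could be approximated
 with something like this (untested, adjust paths and names to taste):

   # day:month:year-hour:minute:second followed by the URL
   logformat pageview %{%d:%m:%Y-%H:%M:%S}tl %ru

   # only log replies whose MIME type looks like HTML
   acl html_pages rep_mime_type -i ^text/html
   access_log /var/log/squid/pages.log pageview html_pages

 Filtering on rep_mime_type is about as close to "pages only" as HTTP
 allows, with all the caveats from the rant above.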

 Amos

>
> day:month:year-hour:minute:second
> url_of_the_page_requested_by_the_user
>
> Looking forward to your reply,
>
> Kind regards,
>
>
> On 22/11/11 22:13, Amos Jeffries wrote:
>> On Tue, 22 Nov 2011 18:11:26 +0000, Sw_at_g wrote:
>>> Hi all,
>>>
>>> First of all, I would like to thank you for your time and effort
>>> for
>>> providing such a great tool.
>>>
>>> I am a new user on archlinux, using Squid locally. I have a few
>>> questions, regarding the setup most of all.
>>>
>>> - Is it possible to change the information logged into access.log?
>>> I
>>> would like it like that
>>>
>>> => date +%#F_%T address_visited (I would like to replace the
>>> timestamps with a human readable time/date and just the website
>>> visited)
>>
>> http://wiki.squid-cache.org/SquidFaq/SquidLogs
>> http://www.squid-cache.org/Doc/config/access_log
>> http://www.squid-cache.org/Doc/config/logformat
>>
>>>
>>> => Is it possible to limit the size of the logs from within the
>>> squid.conf file?
>>>
>>
>> No. You need to integrate log management tools like logrotate.d or
>> cron jobs to control when log rotation occurs.
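>>
>> For example, a logrotate.d entry along these lines (paths assumed)
>> would rotate daily and tell Squid to reopen its log files:
>>
>>   /var/log/squid/*.log {
>>       daily
>>       rotate 7
>>       compress
>>       missingok
>>       postrotate
>>           /usr/sbin/squid -k rotate
>>       endscript
>>   }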
>>
>>
>>> And the last question, I have that "error" coming up from the
>>> cache.log
>>>
>>> IpIntercept.cc(137) NetfilterInterception: NF
>>> getsockopt(SO_ORIGINAL_DST) failed on FD 29: (92) Protocol not
>>> available
>>>
>>> And the browsing becomes really slow; even pages aren't opening
>>> anymore. Any advice?
>>
>> Squid is unable to locate the client details in the kernel NAT
>> table. NAT *must* be done on the Squid box.
>>
>> Also ensure that you have separate http_port lines for the different
>> types of traffic arriving at your Squid.
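>>
>> As a sketch (interface name and port numbers are assumptions), on the
>> Squid box itself:
>>
>>   # squid.conf: one port for regular proxy traffic, one for intercepted
>>   http_port 3128
>>   http_port 3129 intercept
>>
>>   # NAT port 80 into the intercept port, on the same machine
>>   iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 \
>>     -j REDIRECT --to-ports 3129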
>>
>> Amos
Received on Thu Nov 24 2011 - 01:21:04 MST

This archive was generated by hypermail 2.2.0 : Thu Nov 24 2011 - 12:00:03 MST