[squid-users] An example of user-agents (Firefox, Chrome, wget/curl) affecting squid caching

From: Eliezer Croitoru <eliezer_at_ngtech.co.il>
Date: Thu, 11 Jul 2013 14:40:46 +0300

I have been testing some URLs for cachability for quite a while now.
It seems there are different ways to request the same file, and they
lead to different reactions from squid. I want to be 100% sure of the
cause of the *problem* before I run to a conclusion, since right now I
am not 100% sure.
Please take your *free* time to read this and see if there is something
I have missed, in the hope of understanding the issue at hand.

Thanks Ahead,
Eliezer

I have tried wget/curl as opposed to Firefox and Chrome, and they get
a different reaction from squid; I want to be sure what causes it.
Using wget for two simple requests of the same URL I get:
1373541195.850 743 192.168.10.124 TCP_MISS/200 85865 GET http://image.slidesharecdn.com/glusterorgwebinarant-120126131226-phpapp01/95/slide-29-728.jpg?132986699 - HIER_DIRECT/88.221.156.163 image/jpeg
1373541220.437 4 192.168.10.124 TCP_MEM_HIT/200 85737 GET http://image.slidesharecdn.com/glusterorgwebinarant-120126131226-phpapp01/95/slide-29-728.jpg?132986699 - HIER_NONE/- image/jpeg

which is a successful cache HIT.
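For reference, the test was essentially two identical fetches through the
proxy, something like this (the proxy host www1.home:3128 is taken from
the X-Cache headers below):
---------
# two identical wget fetches through the proxy; the second should be a HIT
http_proxy=http://www1.home:3128 wget -O /dev/null \
  'http://image.slidesharecdn.com/glusterorgwebinarant-120126131226-phpapp01/95/slide-29-728.jpg?132986699'
http_proxy=http://www1.home:3128 wget -O /dev/null \
  'http://image.slidesharecdn.com/glusterorgwebinarant-120126131226-phpapp01/95/slide-29-728.jpg?132986699'
----------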
The request headers in this test are:
---------
GET http://image.slidesharecdn.com/glusterorgwebinarant-120126131226-phpapp01/95/slide-11-728.jpg?1329866994 HTTP/1.1
User-Agent: Wget/1.14 (linux-gnu)
Accept: */*
Host: image.slidesharecdn.com
Connection: Close
Proxy-Connection: Keep-Alive

----------
The response is:
---------
HTTP/1.1 200 OK
x-amz-id-2: wQGOvCvBOH4nVmOEbu1UMJ+Kxv4a4v/9oGpyWnIYy8WRtBL6ZAx2yQtZ0T5u3sfr
x-amz-request-id: 2F83F33589002A74
Last-Modified: Wed, 08 Aug 2012 08:30:58 GMT
x-amz-version-id: _9hthq6oqnMYSuZCVxGCF1sN5VJtYebW
ETag: "cd5970b95914bd43a88a021b78d2f67b"
Content-Type: image/jpeg
Server: AmazonS3
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:18:48 GMT
X-Cache: MISS from www1.home
X-Cache-Lookup: MISS from www1.home:3128
Transfer-Encoding: chunked
Connection: keep-alive

----------
and on the second request:
---------
HTTP/1.1 200 OK
x-amz-id-2: wQGOvCvBOH4nVmOEbu1UMJ+Kxv4a4v/9oGpyWnIYy8WRtBL6ZAx2yQtZ0T5u3sfr
x-amz-request-id: 2F83F33589002A74
Last-Modified: Wed, 08 Aug 2012 08:30:58 GMT
x-amz-version-id: _9hthq6oqnMYSuZCVxGCF1sN5VJtYebW
ETag: "cd5970b95914bd43a88a021b78d2f67b"
Content-Type: image/jpeg
Server: AmazonS3
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:18:48 GMT
Age: 347
X-Cache: HIT from www1.home
X-Cache-Lookup: HIT from www1.home:3128
Transfer-Encoding: chunked
Connection: keep-alive

----------
Which makes it a HIT.
So there seems to be nothing wrong with the way the application server
does things, or with basic squid internals.
With Chrome and Firefox, however, there is something different in the request:
---------
GET /glusterorgwebinarant-120126131226-phpapp01/95/slide-4-728.jpg?1329866994 HTTP/1.1
Host: image.slidesharecdn.com
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.71 Safari/537.36
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Cache-Control: max-age=4794000
Connection: keep-alive

----------
which results in this response:
---------
HTTP/1.1 200 OK
x-amz-id-2: VmcmoZnkiG7I/OEc+VJxJJKS7fnsu+BCqEw4NqVuMC7ckHl+DEYidi4P1d1vflRK
x-amz-request-id: BC59D681FF091B4E
Last-Modified: Wed, 08 Aug 2012 08:30:56 GMT
x-amz-version-id: kCNUG8l6HMz03fgYIbYHlsGJmzD3CplD
ETag: "4a351b56fb96496224d67ae752c75386"
Accept-Ranges: bytes
Content-Type: image/jpeg
Server: AmazonS3
Vary: Accept-Encoding
Content-Encoding: gzip
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:26:53 GMT
Content-Length: 48511
X-Cache: MISS from www1.home
X-Cache-Lookup: MISS from www1.home:3128
Connection: keep-alive

---------
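To isolate which request header changes squid's behaviour, the Chrome
request can be replayed with curl, dropping one header at a time (a
debugging sketch; run each variant twice and watch the X-Cache header):
---------
# replay the Chrome request through the proxy; run it twice, then repeat
# without Accept-Encoding and/or Cache-Control to see which header keeps
# the second fetch a MISS instead of a HIT
curl -s -o /dev/null -D - -x http://www1.home:3128 \
  -A 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.71 Safari/537.36' \
  -H 'Accept-Encoding: gzip,deflate,sdch' \
  -H 'Cache-Control: max-age=4794000' \
  'http://image.slidesharecdn.com/glusterorgwebinarant-120126131226-phpapp01/95/slide-4-728.jpg?1329866994' \
  | grep -i x-cache
----------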

while the next Chrome request is answered with a 304:
---------
HTTP/1.1 304 Not Modified
Content-Type: image/jpeg
Last-Modified: Wed, 08 Aug 2012 08:30:56 GMT
ETag: "4a351b56fb96496224d67ae752c75386"
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:29:00 GMT
Connection: keep-alive
Vary: Accept-Encoding

----------
<...>
HTTP Client REPLY:
---------
HTTP/1.1 304 Not Modified
Content-Type: image/jpeg
Last-Modified: Wed, 08 Aug 2012 08:30:56 GMT
ETag: "4a351b56fb96496224d67ae752c75386"
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:29:00 GMT
Vary: Accept-Encoding
X-Cache: MISS from www1.home
X-Cache-Lookup: MISS from www1.home:3128
Connection: keep-alive

----------
So it is the application server that answers with the 304, and not
squid, since squid is obliged to relay a valid HTTP response back to
the client. Chrome thus verifies that its local cache is valid, and
that is fine.
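That revalidation can also be reproduced by hand with a conditional
request (a sketch; the ETag is the one from the trace above):
---------
# manual revalidation: send the cached ETag back; a 304 means the local
# copy is still valid
curl -s -o /dev/null -D - -x http://www1.home:3128 \
  -H 'If-None-Match: "4a351b56fb96496224d67ae752c75386"' \
  -H 'Accept-Encoding: gzip,deflate,sdch' \
  'http://image.slidesharecdn.com/glusterorgwebinarant-120126131226-phpapp01/95/slide-4-728.jpg?1329866994'
----------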

The next scenario is when Chrome forces no-cache in the Cache-Control
header (a forced reload):
---------
GET /glusterorgwebinarant-120126131226-phpapp01/95/slide-4-728.jpg?1329866994 HTTP/1.1
Host: image.slidesharecdn.com
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Pragma: no-cache
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.71 Safari/537.36
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Cache-Control: no-cache
Connection: keep-alive

----------
<...>
HTTP Server REPLY:
---------
HTTP/1.1 200 OK
x-amz-id-2: VmcmoZnkiG7I/OEc+VJxJJKS7fnsu+BCqEw4NqVuMC7ckHl+DEYidi4P1d1vflRK
x-amz-request-id: BC59D681FF091B4E
Last-Modified: Wed, 08 Aug 2012 08:30:56 GMT
x-amz-version-id: kCNUG8l6HMz03fgYIbYHlsGJmzD3CplD
ETag: "4a351b56fb96496224d67ae752c75386"
Accept-Ranges: bytes
Content-Type: image/jpeg
Server: AmazonS3
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 48511
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:34:03 GMT
Connection: keep-alive

----------
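That forced reload is easy to reproduce outside the browser as well
(same kind of sketch as before, proxy host assumed):
---------
# a forced-reload request through the proxy; squid forwards it to the
# origin (a MISS) instead of serving it from cache
curl -s -o /dev/null -D - -x http://www1.home:3128 \
  -H 'Cache-Control: no-cache' -H 'Pragma: no-cache' \
  'http://image.slidesharecdn.com/glusterorgwebinarant-120126131226-phpapp01/95/slide-4-728.jpg?1329866994' \
  | grep -i x-cache
----------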
I am not sure there is a problem, but if there is one I want to debug it.
That request should have been served from cache, since the refresh_pattern
is pretty explicit about it (an illustration of what I mean follows below).
I know how to read the headers and what is supposed to happen, but I am
a bit confused and unable to reach the right conclusion about the root
cause of why squid treats the wget requests differently from the Chrome
requests.
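By a refresh_pattern that is pretty explicit I mean an override of
roughly this kind (illustrative only, not my exact line):
---------
# illustrative squid.conf line: cache these images for a long time and
# ignore client-driven reloads (ignore-reload)
refresh_pattern -i slidesharecdn\.com/.*\.jpg 10080 90% 525600 ignore-reload
----------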
Any new point of view will help me.

Thanks Again,
Eliezer