
I have finalized a small PHP application that can serve many documents. These documents must be cacheable by clients and proxies.

Since proxies can cache my results, I must be extra careful, because the documents I serve can have different MIME types (content negotiation based on $_SERVER['HTTP_ACCEPT']) and different languages (resolved in this order: $_POST value / $_GET value / URL / PHP session value / $_COOKIE value / $_SERVER['HTTP_ACCEPT_LANGUAGE'] / default script value).
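To make the cascade concrete, here is a minimal sketch of how the language could be resolved in that order. The function name resolve_language(), the 'lang' key, the $urlLang argument and the supported-language list are all assumptions made for illustration (they are not taken from my real code), and the sketch assumes PHP 7.1+ for the nullable type hints. Accept-Language q-values are ignored for brevity.

```php
<?php
// Hypothetical helper: resolve the language using the cascade described above.
// $urlLang stands for whatever the application extracts from the URL itself.
function resolve_language(array $post, array $get, ?string $urlLang, array $session,
                          array $cookie, ?string $acceptLanguage,
                          array $supported = ['en-us', 'fr-ca'],
                          string $default = 'en-us'): string
{
    // Explicit sources first, in priority order.
    $candidates = [
        $post['lang'] ?? null,
        $get['lang'] ?? null,
        $urlLang,
        $session['lang'] ?? null,
        $cookie['lang'] ?? null,
    ];
    // Then the Accept-Language tags, in header order (q-values ignored).
    foreach (explode(',', (string) $acceptLanguage) as $part) {
        $candidates[] = trim(explode(';', $part)[0]);
    }
    foreach ($candidates as $candidate) {
        if (!is_string($candidate)) {
            continue;
        }
        $candidate = strtolower($candidate);
        if (in_array($candidate, $supported, true)) {
            return $candidate;
        }
    }
    // Nothing usable was supplied: fall back to the script default.
    return $default;
}
```

In a real request this would be called with $_POST, $_GET, $_SESSION, $_COOKIE and $_SERVER['HTTP_ACCEPT_LANGUAGE']. Only the last of those is a request header, which is why a proxy cannot reproduce the choice on its own.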

In short, the same URL can be served with several MIME types and in several languages (question changed: see edit below).

To help proxies cache, I use the "Vary: Accept" header in combination with the ETag header. The ETag is an MD5 hash of the current language and the last-modified timestamp.

I always:

  • Send an Expires header
  • Send a Cache-Control header
  • Send a Last-Modified header
  • Send a Content-Type header
  • Send an ETag header (based on current language and Last-Modified timestamp)
  • Send a Content-Language
  • Send a "Vary: Accept" header if the document is XHTML
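As a rough sketch, the header set listed above could be assembled like this. build_cache_headers() and its parameters are hypothetical names, the exact way the language and timestamp are concatenated before hashing is an assumption, and the code assumes PHP 7.1+. It returns an array instead of calling header() directly so that it can be checked outside a web request.

```php
<?php
// Sketch: build the response headers listed above for one document.
// The function name and parameters are illustrative only.
function build_cache_headers(string $lang, int $lastModified, string $mimeType,
                             int $maxAge = 10800, ?int $now = null): array
{
    $now = $now ?? time();
    // HTTP dates are always expressed in GMT.
    $httpDate = function (int $ts): string {
        return gmdate('D, d M Y H:i:s', $ts) . ' GMT';
    };
    $headers = [
        'Expires'          => $httpDate($now + $maxAge),
        'Cache-Control'    => 'public, max-age=' . $maxAge,
        'Last-Modified'    => $httpDate($lastModified),
        // Assumed scheme: MD5 of language followed by the timestamp, quoted.
        'ETag'             => '"' . md5($lang . $lastModified) . '"',
        'Content-Language' => $lang,
        'Content-Type'     => $mimeType . '; charset=UTF-8',
    ];
    // Only the XHTML variant is the result of negotiation on Accept.
    if ($mimeType === 'application/xhtml+xml') {
        $headers['Vary'] = 'Accept';
    }
    return $headers;
}

// Usage in a web request ($path is a placeholder for the document file):
// foreach (build_cache_headers('en-us', filemtime($path), 'application/xhtml+xml') as $name => $value) {
//     header("$name: $value");
// }
```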

Now my question: is this enough to help caching on proxies and clients? Did I miss a header?

To help you, here’s the HTTP response header for a test page (on my local environment):

"
Date             Wed, 30 Dec 2009 18:56:26 GMT
Server           Apache/2.0.63 (Win32) PHP/5.1.0
X-Powered-By     PHP/5.1.0
Set-Cookie       Tests=697daqbmple2e1daq2dg74ur96; path=/
Expires          Wed, 30 Dec 2009 21:56:26 GMT
Cache-Control    public, max-age=10800
Last-Modified    Mon, 28 Dec 2009 15:11:49 GMT
Etag             "44fa50be4638161a596e4b75d6ab7a94"
Vary             Accept
Content-Language en-us
Content-Length   3043
Keep-Alive       timeout=15, max=100
Connection       Keep-Alive
Content-Type     application/xhtml+xml; charset=UTF-8
"

EDIT: OK, I understand that serving a document with several MIME types and several languages (which can come from so many sources, see above) from the same URL is just plain bad design. If you want to do this, just use a "private" cache (no caching on proxies)... Am I correct?

If each language has its own URL (but each URL can still be served with several MIME types), is my current implementation OK for a "public" cache (caching on clients and proxies)?

3 Answers

3

Since your output also depends on things a proxy cannot know, like session data, wouldn't it be easier to send a (non-cacheable) redirect to the actual content, which would be fixed for a given URL (with parameters) and therefore much easier to cache? I know this involves an extra round trip, but it's probably much less error-prone and would also cause fewer problems with proxies that don't completely understand or support all your header combinations.

Also, I'm guessing that, if you have two clients going through the same proxy but with different language cookies, your current method would return two different ETags for the same URL, which would make the proxy update its copy each time it sees the other client.
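A minimal sketch of that redirect idea, assuming PHP 7+: the path /document and the parameter names lang and format are placeholders, and canonical_document_url() is a made-up helper, not part of any library. The point is only that the URL the client ends up on fully determines the response.

```php
<?php
// Hypothetical helper: build the canonical, fully parameterised URL.
// http_build_query() takes care of URL-encoding the values.
function canonical_document_url(string $lang, string $format): string
{
    $query = http_build_query(['lang' => $lang, 'format' => $format], '', '&');
    return '/document?' . $query;
}

// Usage in a web request, once $lang and $format have been resolved
// from the session, cookie, POST data and so on:
// header('Cache-Control: no-store');   // keep the redirect itself out of caches
// header('Location: ' . canonical_document_url($lang, $format), true, 302);
// exit;
```

The redirect response is the only part that depends on session or cookie state, so it is the only part that has to stay out of shared caches.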


6 Comments

The session/cookie only contains the language, and this is sent in the ETag AND Content-Language headers...
But how will a proxy know which language to serve to its client?
I understand more now... How would you implement the non-cacheable redirect in PHP?
As you normally would, with something like header("Location: /document?lang={$_COOKIE['lang']}&format={$_POST['format']}") (with proper URL encoding, etc.); it will be non-cacheable by default. Or, if you control the document that generates the HTML containing the link to /document, just generate the correct URL with parameters the first time, and you don't even need the redirect.
Thanks for your help. If each language has its own URL, are my headers appropriate when pages can be cached by proxies and clients (public)? See the edit on the original question.
1

I believe you should be fine in principle -- adding the Vary header means that caches should hold multiple instances of your data, keyed by ETag.

I would note, though, that you don't only vary on Accept, you also vary on Cookie and Accept-Language. Varying by cookie means that the proxy will have to validate every request, but should be able to use an If-None-Match header to let the server indicate which (already cached) ETag should be used.
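For the validation step, a rough sketch of the server side of If-None-Match might look like the following (PHP 7.1+ assumed). etag_matches() is a hypothetical helper, and it assumes the ETag values contain no commas, which holds for a quoted MD5 hex string.

```php
<?php
// Hypothetical helper: does the client's If-None-Match header match our ETag?
function etag_matches(?string $ifNoneMatch, string $etag): bool
{
    if ($ifNoneMatch === null || trim($ifNoneMatch) === '') {
        return false;
    }
    // "*" matches any current representation.
    if (trim($ifNoneMatch) === '*') {
        return true;
    }
    // The header may list several entity tags separated by commas.
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);
        // If-None-Match uses weak comparison, so ignore a "W/" prefix.
        if (strpos($candidate, 'W/') === 0) {
            $candidate = substr($candidate, 2);
        }
        if ($candidate === $etag) {
            return true;
        }
    }
    return false;
}

// Usage in a web request ($etag is the quoted value sent in the ETag header):
// if (etag_matches($_SERVER['HTTP_IF_NONE_MATCH'] ?? null, $etag)) {
//     http_response_code(304);  // the cached copy is still valid; send no body
//     exit;
// }
```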

2 Comments

I think you are right. What's the syntax of the Vary header when you have many conditions like this?
See section 14.44 of w3.org/Protocols/rfc2616/rfc2616-sec14.html. Re-reading your comments, I think you may have to send Vary: *, as the language may change inside a session without the cookies (or any other headers or URL components) changing.
0

If the response varies both on "Accept" and "Accept-Language", then both need to be mentioned in the "Vary" response header.
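The Vary value is just a comma-separated list of request header names. As a small sketch (vary_header() is a made-up helper name, PHP 7+ assumed), it could be built from whichever request headers actually influenced the response:

```php
<?php
// Hypothetical helper: build a Vary header value from the request header
// names that influenced the response, dropping case-insensitive duplicates.
function vary_header(array $requestHeaders): string
{
    $unique = [];
    foreach ($requestHeaders as $name) {
        $key = strtolower($name); // header names are case-insensitive
        if (!isset($unique[$key])) {
            $unique[$key] = $name;
        }
    }
    return implode(', ', array_values($unique));
}

// Usage in a web request:
// header('Vary: ' . vary_header(['Accept', 'Accept-Language']));
```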

2 Comments

It's based on the current language which is (cascade) set by (in this order): $_POST value / $_GET value / URL / PHP session value / $_COOKIE value / $_SERVER['HTTP_ACCEPT_LANGUAGE'] / default script value
If it varies on data other than in the request headers, you'll need to state "Vary: *", or take it out and set Cache-Control accordingly. Otherwise intermediaries will be confused.
