Two Simple Rules for HTTP Caching

adamsun
Posted on 2013/02/04 15:33

In practice, you only need two settings to optimize caching:

  1. Don’t cache HTML
  2. Cache everything else forever

“Wooah…hang on!”, we hear you say. “Cache all my scripts and images forever?”

Yes, that’s right. You don’t need anything else in between. Caching indefinitely is fine as long as you don’t allow your HTML to be cached.

“But what if I need to issue code patches to my JavaScript? I can’t allow browsers to hold on to all my images either. I often need to update those as well.”

Simple – just change the URL of the item in your HTML and it will bypass the existing entry in the cache.

In practice, caching ‘forever’ typically means setting an Expires header value of Sun, 17-Jan-2038 19:14:07 GMT, since that’s close to the maximum value supported by the 32-bit Unix time format (Tue, 19 Jan 2038 03:14:07 GMT). If you’re using IIS6 you’ll find that the UI won’t allow anything beyond 31-Dec-2035. The advantage of setting long expiry dates is that the content can be read from the local browser cache whenever the user revisits the web page or goes to another page that uses the same images, scripts or CSS files.
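
As a rough illustration of where the 2038 limit comes from, the sketch below formats the largest signed 32-bit Unix timestamp as an HTTP date. The function name `far_future_expires` is made up for this example. Any date shortly before the limit, such as the one quoted above, works equally well as a "forever" value.

```python
from email.utils import formatdate

# Largest number of seconds since 1970-01-01 that fits in a signed 32-bit integer.
MAX_32BIT_UNIX_TIME = 2**31 - 1


def far_future_expires() -> str:
    """Return the latest 32-bit-safe moment as an HTTP date string."""
    # usegmt=True emits the "GMT" suffix that HTTP date headers expect.
    return formatdate(MAX_32BIT_UNIX_TIME, usegmt=True)


if __name__ == "__main__":
    print("Expires:", far_future_expires())
    # Expires: Tue, 19 Jan 2038 03:14:07 GMT
```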

You’ll see long expiry dates like this if you look at a Google web page with HttpWatch. For example, here are the response headers used for the main Google logo on the home page:

[Image: Google Expires header]

If Google needs to change the logo for a special occasion like Halloween, they just change the name of the file in the page’s HTML to something like halloween2007.gif.

The diagram below shows how a JavaScript file is loaded into the browser cache on the first visit to a web page:

[Image: Accessing page with empty cache]

On any subsequent visits the browser only has to fetch the page’s HTML:

[Image: Read from cache]

The JavaScript file can be read directly from the browser cache on the user’s hard disk. This avoids a network round trip and is typically 100 to 1000 times faster than downloading the file over a broadband connection.

The key to this caching scheme is to keep tight control over your HTML, as it holds the references to everything else on your web site. One way to do this is to ensure that your pages have a Cache-Control: no-cache header. Strictly speaking, no-cache does not stop the browser from storing the page; it forces the browser to check with the server before every reuse, so the user never sees stale HTML.

If you do this, you can update any content on the page just by changing the URL that refers to it in the HTML. The old version will still be in the browser’s cache, but the updated version will be downloaded because of the modified URL.

For instance, if you had a file called topMenu.js and you fixed some bugs in it, you might rename the file topMenu-v2.js to force it to be downloaded:

[Image: Force update with new file name]

Now this is all very well, but whenever there’s a discussion of longer expiration times, the marketing people get very twitchy and concerned that they won’t be able to re-brand a site if stylesheets and images are cached for long periods of time.

In fact, choosing an expiration time of anything other than zero or infinite is inherently uncertain: with a relative expiry period, each browser’s copy expires at a different moment, depending on when it was fetched. The only way to know exactly when you can release a new version to all users simultaneously is to choose a specific time of day for your cache expiry, say midnight. It’s better to set indefinite caching on all your page-linked items so that you get the maximum amount of caching, and then force updates as required.

Now, by this point, you might have the marketing types on board, but you’ll be losing the developers. By now the developers are seeing all the extra work involved in changing the filenames of all their CSS, JavaScript and images, both in their source-controlled projects and in their deployment scripts.

So here’s the icing on the cake: you don’t actually need to change the filename, just the URL. A simple way to do this is to append a query string parameter to the end of the existing URL when the resource has changed.

Here’s the previous example that updated a JavaScript file. The difference this time is that it uses a query string parameter ‘v2’ to bypass the existing cache entry:

[Image: Force update with query string]

The web server will simply ignore the query string parameter unless you have chosen to do anything with it programmatically.
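
A minimal sketch of how a template helper might stamp a version onto a resource URL is shown below. The helper name `versioned_url` and the parameter name `v` are assumptions made for illustration; the article only requires that the URL changes in some way when the resource does.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def versioned_url(url: str, version: str) -> str:
    """Return url with its 'v' query parameter set to version.

    'v' is a hypothetical parameter name; any name the server ignores works.
    """
    parts = urlsplit(url)
    # Keep every existing parameter except an older version stamp.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "v"]
    query.append(("v", version))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # First release of the fix: the browser sees a URL it has never cached.
    print(versioned_url("/scripts/topMenu.js", "2"))
    # A later fix replaces the old stamp instead of stacking a second one.
    print(versioned_url("/scripts/topMenu.js?v=2", "3"))
```

Because the file on disk keeps its name, source control and deployment scripts stay untouched; only the HTML that references the file changes.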

There’s one final optimization you can make. The Cache-Control: no-cache response header works well for dynamic pages, as it ensures that pages will always be refreshed from the server, even when pressing the Back button. However, for HTML that changes less frequently it is better to use the Last-Modified header instead. This avoids a complete download of the page’s HTML if it has not changed since it was last cached by the browser.

The Last-Modified header is added automatically by IIS for static HTML files and can be added programmatically in dynamic pages (e.g. ASPX and PHP). When this header is present, the browser will revalidate the local, cached copy of an HTML page in each new browser session. If the page is unchanged the web server returns a 304 Not Modified response indicating the browser can use the cached version of the page.
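
For dynamic pages that add the header themselves, the server-side decision can be sketched roughly as follows. The function name `revalidate` and its return shape are invented for this example, and it assumes the caller passes the page’s modification time as a UTC-aware datetime. A real framework would also handle ETag and other request headers.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def revalidate(last_modified: datetime,
               if_modified_since: Optional[str]) -> Tuple[int, Dict[str, str]]:
    """Pick 200 or 304 for a page last changed at last_modified (UTC-aware)."""
    # HTTP dates have one-second resolution, so drop any microseconds.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # Unparseable header: fall back to a full response.
        if since is not None and since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        if since is not None and last_modified <= since:
            return 304, headers  # The browser's cached copy is still current.

    return 200, headers  # Send the full page body.


if __name__ == "__main__":
    changed = datetime(2007, 12, 10, 12, 0, 0, tzinfo=timezone.utc)
    status, headers = revalidate(changed, None)
    print(status, headers)
    # The browser echoes Last-Modified back as If-Modified-Since next time.
    print(revalidate(changed, headers["Last-Modified"])[0])
```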

So to summarize:

  1. Don’t cache HTML
    • Use Cache-Control: no-cache for dynamic HTML pages
    • Use the Last-Modified header with the current file time for static HTML
  2. Cache everything else forever
    • For all other file types set an Expires header to the maximum future date your web server will allow
  3. Modify URLs by appending a query string in your HTML to any page element you wish to ‘expire’ immediately.
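
The summary above can be sketched as a single header-selection function. This is only an illustration under stated assumptions: the name `cache_headers`, the suffix list, and the `dynamic` and `mtime` parameters are all hypothetical, and on a real site these headers would normally be set in the web server’s configuration rather than in application code.

```python
from email.utils import formatdate
from typing import Dict, Optional

# Latest date a signed 32-bit Unix timestamp can represent.
FAR_FUTURE = formatdate(2**31 - 1, usegmt=True)
# Assumed set of suffixes that identify static HTML files.
HTML_SUFFIXES = (".html", ".htm")


def cache_headers(path: str, *, dynamic: bool = False,
                  mtime: Optional[float] = None) -> Dict[str, str]:
    """Choose caching headers for a response, following the summary rules."""
    if dynamic:
        # Rule 1, dynamic pages: always go back to the server.
        return {"Cache-Control": "no-cache"}
    if path.lower().endswith(HTML_SUFFIXES):
        if mtime is None:
            # File time unknown: fall back to the stricter rule.
            return {"Cache-Control": "no-cache"}
        # Rule 1, static HTML: let the browser revalidate against the file time.
        return {"Last-Modified": formatdate(mtime, usegmt=True)}
    # Rule 2, everything else: cache 'forever'.
    return {"Expires": FAR_FUTURE}


if __name__ == "__main__":
    print(cache_headers("/index.aspx", dynamic=True))
    print(cache_headers("/about.html", mtime=86400))
    print(cache_headers("/images/logo.gif"))
```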

Reposted from: http://blog.httpwatch.com/2007/12/10/two-simple-rules-for-http-caching/
