3ce3fe9 by Anon Ray at 2012-11-23 |
[HTML5 Boilerplate homepage](http://html5boilerplate.com) | [Documentation
table of contents](README.md)

# .htaccess

In the Apache HTTP Server, `.htaccess` (hypertext access) is a per-directory
configuration file that lets you adjust the web server's configuration. HTML5
Boilerplate includes a number of best-practice server rules for making web
pages fast and secure; you apply these rules through the `.htaccess` file.

**You'll want to have these modules enabled for optimum performance:**

* `mod_setenvif.c` (setenvif_module)
* `mod_headers.c` (headers_module)
* `mod_deflate.c` (deflate_module)
* `mod_filter.c` (filter_module)
* `mod_expires.c` (expires_module)
* `mod_rewrite.c` (rewrite_module)

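The `.c` names in the list above are the form that `<IfModule>` tests for, so a
rule can be wrapped to apply only when its module is loaded. A minimal sketch
(the header shown is only an illustrative choice, not necessarily a rule taken
from the H5BP file):

```apache
# Applied only if mod_headers is loaded; otherwise Apache skips the block
# instead of failing on an unknown directive.
<IfModule mod_headers.c>
    Header set X-Content-Type-Options "nosniff"
</IfModule>
```
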
## On Windows

You have a couple of options, depending on how you installed Apache.

1. **WampServer**. This is by far the simplest option. If you have installed
   WampServer, click its icon in the task bar, hover over the Apache section
   of the menu that comes up, and then hover over the modules section. You
   will be presented with a list of modules. Click a module name to enable it
   (or to disable it if it is already enabled). A check mark next to a module
   indicates that it is enabled. WampServer automatically restarts the Apache
   service after you enable a module.

2. **Manually editing `httpd.conf`**. This assumes that you have manually
   installed Apache. Locate the `httpd.conf` file, which is normally in the
   `conf` folder inside the folder where you installed Apache (for example
   `C:\apache\conf\httpd.conf`), and open it in a text editor. Near the top
   (after a block of comments) you will see a long list of modules. Make sure
   that the modules listed above are not commented out. If they are,
   uncomment them and restart Apache.

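For reference, module lines in a stock `httpd.conf` typically look like the
sketch below; a leading `#` means the module is commented out. The `modules/`
path is the usual default and is an assumption here, so it may differ in your
installation.

```apache
# Enabled: the line is active.
LoadModule headers_module modules/mod_headers.so
LoadModule rewrite_module modules/mod_rewrite.so

# Disabled: remove the leading "#" to enable the module, then restart Apache.
#LoadModule deflate_module modules/mod_deflate.so
#LoadModule expires_module modules/mod_expires.so
```
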
That's it, you're done!


## On Linux

These instructions should work on any distribution where Apache was installed
with `apt-get`.

1. Open a terminal and type the following command. Enter your password when
   prompted.

   `sudo a2enmod setenvif headers deflate filter expires rewrite include`

2. Restart Apache with the following command so that the new configuration
   takes effect.

   `sudo /etc/init.d/apache2 restart`

That's it, you're done!

## On Mac

Coming soon...

## Security

Do not turn off your `ServerSignature` or hide the `Server:` HTTP header (in
Apache, the header's level of detail is set by `ServerTokens`). Serious
attackers can use other kinds of fingerprinting methods to figure out the
actual server and components running behind a port. Instead, as a site owner,
you should keep track of what's listening on ports on hosts that you control.
Run a periodic scanner to make sure nothing suspicious is running on a host you
control, and use the `ServerSignature` to determine if this is the web server
and version that you expect.

## Performance

### Configure ETags

```apache
FileETag None
```

Entity tags (ETags) are a mechanism that web servers and browsers use to
determine whether the component in the browser's cache matches the one on the
origin server. (An "entity" is another word for a "component": images, scripts,
stylesheets, etc.) ETags were added to provide a mechanism for validating
entities that is more flexible than the last-modified date. An `ETag` is a
string that uniquely identifies a specific version of a component. The only
format constraint is that the string be quoted. The origin server specifies
the component's `ETag` using the `ETag` response header.

```http
HTTP/1.1 200 OK
Last-Modified: Tue, 12 Dec 2006 03:03:59 GMT
ETag: "10c24bc-4ab-457e1c1f"
Content-Length: 12195
```

Later, if the browser has to validate a component, it uses the `If-None-Match`
header to pass the `ETag` back to the origin server. If the ETags match, a 304
status code is returned, reducing the response by 12195 bytes in this example.

```http
GET /i/yahoo.gif HTTP/1.1
Host: us.yimg.com
If-Modified-Since: Tue, 12 Dec 2006 03:03:59 GMT
If-None-Match: "10c24bc-4ab-457e1c1f"

HTTP/1.1 304 Not Modified
```

The problem with ETags is that they are typically constructed using attributes
that make them unique to a specific server hosting a site. ETags won't match
when a browser gets the original component from one server and later tries to
validate that component on a different server, a situation that is all too
common on web sites that use a cluster of servers to handle requests. By
default, both Apache and IIS embed data in the ETag that dramatically reduces
the odds of the validity test succeeding on web sites with multiple servers.

The ETag format for Apache 1.3 and 2.x is `inode-size-timestamp`. Although a
given file may reside in the same directory across multiple servers, and have
the same file size, permissions, timestamp, etc., its inode is different from
one server to the next.

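If you would rather keep ETags on a multi-server site than remove them, one
possible alternative (not what the H5BP file does; it uses `FileETag None` as
shown at the top of this section) is to leave the inode out of the tag. A
minimal sketch, assuming the files have identical sizes and modification times
on every server:

```apache
# Build ETags from modification time and size only, so that identical files
# on different servers produce the same tag.
FileETag MTime Size
```
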
IIS 5.0 and 6.0 have a similar issue with ETags. The format for ETags on IIS is
`Filetimestamp:ChangeNumber`. A `ChangeNumber` is a counter used to track
configuration changes to IIS. It's unlikely that the `ChangeNumber` is the same
across all IIS servers behind a web site.

The end result is that ETags generated by Apache and IIS for the exact same
component won't match from one server to another. If the ETags don't match, the
user doesn't receive the small, fast 304 response that ETags were designed for;
instead, they'll get a normal 200 response along with all the data for the
component. If you host your web site on just one server, this isn't a problem.
But if you have multiple servers hosting your web site, and you're using Apache
or IIS with the default ETag configuration, your users are getting slower
pages, your servers have a higher load, you're consuming greater bandwidth, and
proxies aren't caching your content efficiently. Even if your components have a
far-future `Expires` header, a conditional GET request is still made whenever
the user hits Reload or Refresh.

If you're not taking advantage of the flexible validation model that ETags
provide, it's better to just remove the ETag altogether. The `Last-Modified`
header validates based on the component's timestamp, and removing the ETag
reduces the size of the HTTP headers in both the response and subsequent
requests. A Microsoft Support article describes how to remove ETags in IIS. In
Apache, you do this by adding the `FileETag None` line shown at the top of this
section to your Apache configuration file.

### Gzip Components

Compression reduces response times by reducing the size of the HTTP response.

Starting with HTTP/1.1, web clients indicate support for compression with the
`Accept-Encoding` header in the HTTP request.

```
Accept-Encoding: gzip, deflate
```

If the web server sees this header in the request, it may compress the response
using one of the methods listed by the client. The web server notifies the web
client of this via the `Content-Encoding` header in the response.

```
Content-Encoding: gzip
```

Gzip is the most popular and effective compression method at this time. It was
developed by the GNU project and standardized by RFC 1952. The only other
compression format you're likely to see is deflate, but it's less effective and
less popular.

Gzipping generally reduces the response size by about 70%. Approximately 90% of
today's Internet traffic travels through browsers that claim to support gzip.
If you use Apache, the module configuring gzip depends on your version: Apache
1.3 uses `mod_gzip` while Apache 2.x uses `mod_deflate`.

There are known issues with browsers and proxies that may cause a mismatch
between what the browser expects and what it receives with regard to compressed
content. Fortunately, these edge cases are dwindling as the use of older
browsers drops off. The Apache modules help out by adding appropriate `Vary`
response headers automatically.

Servers choose what to gzip based on file type, but are typically too limited
in what they decide to compress. Most web sites gzip their HTML documents. It's
also worthwhile to gzip your scripts and stylesheets, but many web sites miss
this opportunity. In fact, it's worthwhile to compress any text response,
including XML and JSON. Image and PDF files should not be gzipped because they
are already compressed. Trying to gzip them not only wastes CPU but can
potentially increase file sizes.

Gzipping as many appropriate file types as possible is an easy way to reduce
page weight and accelerate the user experience.

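As a rough illustration of the advice above on Apache 2.x, the sketch below
compresses a handful of text-based MIME types with `mod_deflate`. The list of
types is an illustrative selection, not the full set used in the H5BP file.

```apache
<IfModule mod_deflate.c>
    <IfModule mod_filter.c>
        # Compress text-based responses only; images and PDFs are left alone
        # because they are already compressed.
        AddOutputFilterByType DEFLATE text/html text/css text/plain \
                                      application/javascript \
                                      application/json \
                                      application/xml
    </IfModule>
</IfModule>
```
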
### Cache busting

A first-time visitor to your page may have to make several HTTP requests, but
by using the `Expires` header you make those components cacheable. This avoids
unnecessary HTTP requests on subsequent page views. `Expires` headers are most
often used with images, but they should be used on all components including
scripts, stylesheets, etc.

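A minimal sketch of setting `Expires` headers with `mod_expires` is shown
below. The MIME types and lifetimes are illustrative assumptions; choose values
that match how often your own components change.

```apache
<IfModule mod_expires.c>
    ExpiresActive on

    # Far-future expiry for components that are revved by filename.
    ExpiresByType text/css                "access plus 1 year"
    ExpiresByType application/javascript  "access plus 1 year"
    ExpiresByType image/png               "access plus 1 month"

    # HTML should be revalidated on every visit.
    ExpiresByType text/html               "access plus 0 seconds"
</IfModule>
```
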
Traditionally, if you use a far-future `Expires` header, you have to change the
component's filename whenever the component changes.

The H5BP `.htaccess` has built-in filename cache busting. To use it, uncomment
the relevant lines in the `.htaccess` file.

Doing so will route all requests for `/path/filename.20120101.ext` to
`/path/filename.ext`. To use this, just add a timestamp number (or your own
numbered versioning system) into your resource filenames in your HTML source
whenever you update those resources.

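As a rough sketch of what such a rule can look like (the extension list here is
an assumption; check the commented-out lines in your copy of the `.htaccess`
file for the exact rule):

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Only rewrite when the requested (versioned) file does not exist on disk.
    RewriteCond %{REQUEST_FILENAME} !-f

    # /path/filename.20120101.ext  ->  /path/filename.ext
    RewriteRule ^(.+)\.(\d+)\.(js|css|png|jpe?g|gif)$ $1.$3 [L]
</IfModule>
```

The first capture group keeps the path and base name, the second swallows the
version number, and the third restores the extension.
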
#### Example:

```html
<script src="/js/myscript.20120305.js"></script>
<script src="/js/jqueryplugin.45.js"></script>
<link rel="stylesheet" href="css/somestyle.49559939932.css">
<link rel="stylesheet" href="css/anotherstyle.2.css">
```

**N.B. You do not have to rename the resource on the filesystem.** All you have
to do is add the timestamp number to the filename in your HTML source. The
`.htaccess` directive will serve up the proper file.

Traditional cache busting involved adding a query string to the end of your
JavaScript or CSS filename whenever you updated it.

```html
<script src="/js/all.js?v=12"></script>
```

However, as [Steve Souders](http://stevesouders.com/) explains in [*Revving
Filenames: don’t use
querystring*](http://www.stevesouders.com/blog/2008/08/23/revving-filenames-dont-use-querystring/),
the query string approach is not always reliable for clients behind a Squid
proxy server.

## Trailing slash redirects

Trailing slash redirects can be done by adding one of the options below to
`.htaccess`.

### Option 1

Rewrite `domain.com/foo` -> `domain.com/foo/`.

```apache
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5}|/|#(.*))$
RewriteRule ^(.*)$ $1/ [R=301,L]
```

### Option 2

Rewrite `domain.com/foo/` -> `domain.com/foo`.

```apache
RewriteRule ^(.*)/$ $1 [R=301,L]
```

Here are some tips on how to integrate the rewrite rules with different CMS
tools. There are four areas you need to look out for:

### 1. Keep a backup

If you use trailing slash redirects on an existing site, always keep a backup
of your `.htaccess` and test thoroughly on your staging server before using it
on a production server.

### 2. Don't replace existing rules, merge

For example, if you use CodeIgniter, you may have existing URL rewrite rules
like:

```apache
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1
```

Merge the above with the H5BP rules below:

```apache
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5}|/|#(.*))$
RewriteRule ^(.*)$ $1/ [R=301,L]
```

### 3. Be careful of the order

Make sure you test thoroughly in your staging environment. For the above
example, the trailing slash rule comes first and your existing rule comes
after it:

```apache
# this adds trailing slash
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5}|/|#(.*))$
RewriteRule ^(.*)$ $1/ [R=301,L]

# this gets rid of index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1
```

### 4. Double-check `RewriteBase` path is correct

Make sure your `RewriteBase` path points to the correct location and sits above
any rewrite rules. This commonly trips up those who installed WordPress with
the auto install. For instance, if you have a site at `example.com/blog`, your
`RewriteBase` may look like:

```apache
RewriteBase /blog/
```

If you already have a working `RewriteBase`, keep it; don't remove it.
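
Putting the pieces together, here is a minimal sketch of the ordering for the
`example.com/blog` site mentioned above, with `RewriteBase` placed ahead of the
trailing slash rule from Option 1. The `/blog/` path is only the illustrative
value from that example.

```apache
RewriteEngine On

# Declared above the rewrite rules, as recommended in tip 4.
RewriteBase /blog/

# Add a trailing slash (Option 1).
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5}|/|#(.*))$
RewriteRule ^(.*)$ $1/ [R=301,L]
```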