mod_rewrite
Uses
• Obscure original URLs
• Clean URLs
• SEO (Seriously.)
• Redirection
• Access control
Where can I place rewrite rules?
• httpd.conf (or included config files)
• VirtualHost
• Directory*
• .htaccess
* Certain restrictions apply.**
** Use .htaccess files
Directives
• RewriteEngine
• RewriteOptions
• RewriteRule
• RewriteCond
• RewriteLog
• RewriteBase
• ...and more! (that we won’t cover)
Exhibit A
# Default WordPress rewrite rules
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
Exhibit A, Explained
# Default WordPress rewrite rules
# Turn rewrite engine on
RewriteEngine On
# Base all rewrites on ‘/’ URL
RewriteBase /
# If the requested file does not exist...
RewriteCond %{REQUEST_FILENAME} !-f
# ...and no directory exists either (conditions are ANDed by default)...
RewriteCond %{REQUEST_FILENAME} !-d
# ...rewrite the URL to /index.php
RewriteRule . /index.php [L]
RewriteEngine on | off
• Enables or disables the runtime rewriting engine
• Required inside VirtualHost and .htaccess
RewriteRule pattern substitution [flags]
• Can occur more than once
• Processed in order
• First rule processed on URL path
• Subsequent rules processed on previous output
• Think of chaining commands together in bash
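The "subsequent rules see the previous output" behavior can be sketched with two rules. This is only an illustration, assuming an .htaccess file in the document root; the paths old/, new/ and the script app.php are hypothetical names, not part of the original slides.

```apache
# .htaccess in the document root (assumed location)
RewriteEngine On

# Rule 1: move the hypothetical legacy prefix old/ to new/
# A request for /old/page becomes new/page
RewriteRule ^old/(.*)$ new/$1

# Rule 2: operates on the OUTPUT of rule 1, so it also matches
# requests that originally asked for /old/page
RewriteRule ^new/(.*)$ app.php?path=$1 [L]
```

A request for /old/page and a request for /new/page should both end up at app.php?path=page, much like piping the output of one command into the next.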
RewriteRule pattern substitution [flags]
• PCRE (Perl-Compatible Regular Expressions)
• Matches the URL path
• RewriteCond required to match anything else
Regex
[Slide filled edge to edge with an excerpt of a single, very long regular expression]
Regex
(RFC 822 Compliant Email Validation)
Regex®
It’s Not That Hard™
RewriteRule pattern substitution [flags]
The substitution can be one of several things:
• File-system path
• URL path
• Absolute URL
• - [dash] (no substitution)
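One rule per substitution type might look like the following. This is a sketch under stated assumptions: the rules sit in server or VirtualHost context (so the matched path starts with a slash), and /catalog.php, /srv/files and blog.example.com are made-up names used purely for illustration.

```apache
RewriteEngine On

# URL path: serve /catalog.php internally when /products is requested
RewriteRule ^/products$ /catalog.php

# File-system path: map downloads onto a directory outside DocumentRoot
# (only meaningful in server/VirtualHost context; /srv must exist on disk,
# and access to the directory still has to be granted separately)
RewriteRule ^/downloads/(.+)$ /srv/files/$1

# Absolute URL: send the browser to another host with an external redirect
RewriteRule ^/blog/(.*)$ http://blog.example.com/$1 [R=302,L]

# Dash: leave the URL untouched and only apply the flag (here: 403 Forbidden)
RewriteRule \.bak$ - [F]
```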
RewriteRule pattern substitution [flags]
• Flags affect the behavior of the rule or condition
• Contained in square brackets
• Comma-separated, with no spaces between flags
[NC]
[NC,QSA,L]
RewriteRule pattern substitution [flags]
C        Chain this rule to the next rule
E        Set environment variable
F        403 Forbidden status
G        410 Gone status
H        Specify handler
L        Don’t process rules after this one
NC       Case-insensitive
QSA      Append query string
R[=301]  Redirect (R=HTTP status code)
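A few of these flags in use, as a rough sketch. It assumes an .htaccess file in the document root; the paths retired/, about-us and the script item.php are hypothetical.

```apache
RewriteEngine On

# G: answer "410 Gone" for a retired section; no substitution is needed
RewriteRule ^retired/ - [G]

# NC + R + L: match regardless of case, redirect permanently, then stop
RewriteRule ^about-us$ /about [NC,R=301,L]

# QSA: keep the visitor's query string and append it to the new one,
# so /item/5?color=red becomes item.php?id=5&color=red
RewriteRule ^item/([0-9]+)$ item.php?id=$1 [QSA,L]
```

Without QSA, a substitution that contains its own query string replaces the original one instead of merging with it.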
RewriteCond variable pattern [flags]
• variable: an HTTP server variable, written as %{NAME} (e.g. %{HTTP_HOST}, %{REQUEST_URI})
• pattern: must match the given variable for the following RewriteRule to apply
• flags: optional, change the behavior, just like RewriteRule
RewriteCond variable pattern [flags]
• HTTP_REFERER
• HTTP_HOST
• REMOTE_ADDR
• REQUEST_METHOD
• SERVER_PORT
• HTTP_USER_AGENT
• REQUEST_URI
• etc.
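Conditions on these variables are a common way to do the access control mentioned under "Uses". The sketch below assumes an .htaccess file in the document root; the admin path, the user-agent string "badbot" and the address 192.0.2.10 (from the documentation-only address range) are placeholders.

```apache
RewriteEngine On

# Refuse anything that is not a GET, HEAD or POST request
RewriteCond %{REQUEST_METHOD} !^(GET|HEAD|POST)$
RewriteRule ^ - [F]

# Forbid admin/ unless the client connects from one specific address
RewriteCond %{REMOTE_ADDR} !^192\.0\.2\.10$
RewriteRule ^admin(/|$) - [F]

# Conditions are ANDed by default; [OR] makes either one sufficient
RewriteCond %{HTTP_USER_AGENT} badbot [NC,OR]
RewriteCond %{HTTP_REFERER} spam\.example\.com [NC]
RewriteRule ^ - [F]
```

Each group of RewriteCond lines applies only to the RewriteRule that immediately follows it.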
Exhibit A, Explained (Again)
# Default WordPress rewrite rules
# Turn rewrite engine on
RewriteEngine On
# Base all rewrites on ‘/’ URL
RewriteBase /
# If the requested file does not exist...
RewriteCond %{REQUEST_FILENAME} !-f
# ...and no directory exists either (conditions are ANDed by default)...
RewriteCond %{REQUEST_FILENAME} !-d
# ...rewrite the URL to /index.php
RewriteRule . /index.php [L]
non-www to www
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

Pattern: ^(.*)$
Substitution: http://www.example.com/$1
Flags: [R=301,L]
^(.*)$
• Match this pattern
• The pattern inside the parentheses is captured as the variable ‘$1’
http://www.example.com/
• Substitute it with “http://www.example.com/”
$1
• The pattern captured inside the parentheses
[R=301,L]
• Redirect with a 301 code
• Make this the last RewriteRule
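The pattern ^(.*)$ suits .htaccess files, where the leading slash is stripped before matching. In server or VirtualHost context the matched path starts with a slash, so a variant along these lines avoids a double slash in the redirect target. This is a sketch; the *:80 listener address is an assumption.

```apache
<VirtualHost *:80>
    ServerName example.com

    RewriteEngine On
    # Only act when the request came in on the bare domain
    RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
    # Consume the leading slash in the pattern so $1 does not start with one
    RewriteRule ^/(.*)$ http://www.example.com/$1 [R=301,L]
</VirtualHost>
```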
RewriteOptions inherit
• Inherit the configuration of the parent
• In per-directory context, conditions and rules of the parent directory's .htaccess configuration are inherited
• In per-virtual-server context, maps, conditions and rules of the main server are inherited
Example in httpd.conf
...
RewriteRule ^(.*)$ index.php
...
ServerName zomgbacon.com
DocumentRoot /home/bacon/public_html
# Turn on the rewrite engine and inherit any rules
RewriteEngine On
RewriteOptions Inherit
...
RewriteBase URL-path
• Sets the base URL for per-directory rewrites
• URLs are NOT directly related to physical filename paths
RewriteBase Example in .htaccess file
#
# /abc/def/.htaccess -- per-dir config file for directory /abc/def
# Remember: /abc/def is the physical path of /xyz, i.e., the server
#           has an 'Alias /xyz /abc/def' directive, e.g.
#
RewriteEngine On
# let the server know that we were reached via /xyz and not
# via the physical path prefix /abc/def
RewriteBase /xyz
# now the rewriting rules
RewriteRule ^oldstuff\.html$ newstuff.html
RewriteLog log-path
• Logs rewrites (Apache 2.2 and earlier; the directive was removed in 2.4)
• Level of logging can be tuned
• Relative paths are relative to ServerRoot
• Absolute paths are...well, absolute.
RewriteLogLevel level
• Integer value 0-9
• 0 == disabled
• 9 == log nearly everything
• More verbose, greater impact on performance
• Use levels above 2 for debugging only
Logging Examples in httpd.conf
...
ServerName zomgbacon.com
DocumentRoot /home/bacon/public_html
# Relative path: resolved against ServerRoot, not DocumentRoot
RewriteLog rewrite.log
# Make it semi-verbose
RewriteLogLevel 5
# Turn on rewrite engine and inherit rules
RewriteEngine on
RewriteOptions Inherit
...
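On Apache 2.4 and later, RewriteLog and RewriteLogLevel no longer exist; rewrite tracing is configured through the per-module LogLevel directive and written to the error log. A roughly comparable setup might look like this, where the choice of trace3 is an assumption about how much detail is wanted:

```apache
# Apache 2.4+: keep the global level at "warn", but trace mod_rewrite
# Levels run from rewrite:trace1 (least detail) to rewrite:trace8 (most)
LogLevel warn rewrite:trace3
```

As with the old log levels, higher trace levels slow the server down and are best kept for debugging sessions.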
fin