January 19, 2007

Intro to Mod Rewrite for SEO URLs

Richard Lee @ 10:37 am

Mod Rewrite (mod_rewrite) is an Apache module commonly used with PHP to create search engine friendly URLs. Essentially, this module lets us mask ugly query strings with much more meaningful URLs. For example, take a product page in a database-driven catalog:

http://www.mysite.com.au/catalog/product.php?pid=123

At the moment this URL tells us nothing about the destination page, and search engine bots may be reluctant to index URLs like this. Wouldn’t it be better if we included the product name?

http://www.mysite.com.au/catalog/products/t-shirt/123

Much better :) Using the power of regular expressions, we can extract the information from our new URL and pass it on to the correct PHP page behind the scenes.

So how do we do it? In the case of our new URL, all we need to do is extract the last part, which carries the product id, and pass it on to the product.php page. We can do this easily in a .htaccess file:

First we enable the Mod Rewrite engine using RewriteEngine on:

# HTACCESS
 
# enable the module
RewriteEngine On

Then we set the base URL we will be rewriting from using RewriteBase:

# base URL, which in our case is 'catalog' since this is where our app is sitting
RewriteBase /catalog/

Now we do our rewrite using the RewriteRule directive, which takes the form RewriteRule Pattern Substitution. Pattern is the Regular Expression we use to evaluate the incoming URL; Substitution is the string that replaces the original URL when Pattern matches.

RewriteRule ^products/[a-zA-Z0-9_-]+/([0-9]+)$ product.php?pid=$1
# End HTACCESS
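
Put together, the three directives above might look like the following in a single .htaccess file. This is a minimal sketch that assumes the file lives in the catalog directory and that the server allows .htaccess overrides. The comments sit on their own lines because Apache does not accept a comment on the same line as a directive, and the [L] flag (not part of the rule as shown above) is an optional addition that stops further rewrite rules from being processed once this one matches.

```apache
# .htaccess, assumed to be placed in the catalog directory

# Enable the rewrite engine
RewriteEngine On

# URL prefix that relative substitutions are resolved against
RewriteBase /catalog/

# /catalog/products/t-shirt/123 is served by product.php?pid=123
RewriteRule ^products/[a-zA-Z0-9_-]+/([0-9]+)$ product.php?pid=$1 [L]
```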

The Regular Expression:

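- The caret ^ and dollar sign $ anchor the pattern to the start and end of the string being matched, so the whole path has to fit the pattern (in our .htaccess file this is the part of the URL path after /catalog/)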

- Square brackets specify a set of allowed characters, such as a to z, A to Z and 0 to 9; the plus sign + after the closing bracket means “one or more” of those characters

- Round brackets are used to “capture” parts matched by our pattern - in this case the product id - which we later reference in our substitute URL using the back-reference $1
(back-references are numbered in the order the round brackets appear, i.e. if we had also enclosed the product name match in round brackets, this would be $1 and the product id would be $2)
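
To illustrate the numbering, here is a sketch of the same rule with the product name captured as well. The name query parameter is hypothetical: the product.php page described here only reads pid, so it would have to be extended to make use of the extra value. The [L] flag is also an optional addition. The RewriteEngine and RewriteBase lines are assumed to be in place as before.

```apache
# Group 1 captures the product name, group 2 captures the product id
# "name" is a hypothetical parameter used only for illustration
RewriteRule ^products/([a-zA-Z0-9_-]+)/([0-9]+)$ product.php?name=$1&pid=$2 [L]
```

With this rule, a request for /catalog/products/t-shirt/123 would be handled as product.php?name=t-shirt&pid=123.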

Easy enough? This is a relatively simple rewrite and I have explained it in fairly plain terms. More complicated rewrites require some knowledge of Regular Expressions. If you haven’t played with Regular Expressions, I highly recommend you check out Wikipedia, the DevShed articles and the cheat sheets supplied by ILoveJackDaniels.com.