How to redirect domains using .htaccess

Post by Neo » Sat Mar 13, 2010 12:30 pm

So you decided your domain doesn’t fit you as well as you’d like, and you are afraid you’ll lose all the old links and search engine ranking if you move to a new one. Maybe you also want to use the current domain for a different project that suits it better. Or maybe you just want to change the structure of your URLs, or you want a different URI structure in the new domain. Let’s see how to write some .htaccess rules so Apache can do the magic for you.

Redirect the whole domain
Redirecting a whole domain with .htaccess rules is easy. All you have to do is put one of the rule sets below in a file named .htaccess in the root directory of your old domain.

You can use any of the following:

Code:

redirect 301 / http://www.newdomain.com/
OR

Code:

RewriteEngine on
RewriteRule ^(.*)$ http://www.newdomain.com/$1 [R=301,L]
OR

Code:

RewriteEngine on 
RewriteCond %{HTTP_HOST} !^www\.newdomain\.com
RewriteRule (.*) http://www.newdomain.com/$1 [R=301,L]
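The last one is the safest choice: the RewriteCond line applies the rule only when the requested host is not already www.newdomain.com, so it cannot loop if both domains are served from the same directory.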

Let’s see how the magic happens, taking the second variant as the example. The first line merely turns on Apache’s rewrite engine so that the rewrite rules are interpreted.

The second line is composed of 4 parts.
  1. The RewriteRule keyword
  2. A regular expression (regex) to match the original URI
  3. The URI to redirect to, and
  4. Parameters (flags, in square brackets)
The regular expression in this case is ^(.*)$. The dot in a regular expression matches any single character except the newline, so any letter, number or symbol contained in a URL will be matched. The asterisk (*) means that you want to repeat the regex it affects zero or more times.

Simply put, (.*) matches any number of non-newline characters, including zero.

Additionally, the caret (^) represents the start of a line and the dollar sign ($) represents the end, so the whole expression matches a line of zero or more non-newline characters, i.e., the whole path of your URI inside your domain.
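
To make the matching concrete, here is a hedged sketch of what the pattern is actually compared against when the rule lives in a .htaccess file in the document root. The sample request URLs are made up for illustration; the comments describe how mod_rewrite treats rules in this per-directory context.

```apache
RewriteEngine on

# Text that ^(.*)$ is tested against, per request (sample URLs only):
#
#   http://www.olddomain.com/                 ->  ""  (empty, still a match)
#   http://www.olddomain.com/about            ->  "about"
#   http://www.olddomain.com/path/to/page     ->  "path/to/page"
#   http://www.olddomain.com/search?q=apache  ->  "search"
#
# The host name, the leading slash and the query string are not part
# of the tested text. The query string is carried over to the target
# automatically because the target defines no query string of its own.
RewriteRule ^(.*)$ http://www.newdomain.com/$1 [R=301,L]
```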

The third part is the URI to redirect to. As you can see, it is pretty straightforward; the only tricky bit is the $1.

The reason for it is that you don’t want to redirect every URI in the old domain to the root of your new domain. What you want is to redirect each page to its equivalent in the new one. For instance, the URI http://www.olddomain.com/path/to/page should redirect to http://www.newdomain.com/path/to/page.

Now, for every regular expression inside parentheses, the text it matches is stored as a variable for use in the third part. The variables are called $1, $2, $3 and so on. We have only one regex matching the whole path (remember the domain name is not included), so we have the whole path stored in $1.
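
The difference $1 makes is easiest to see side by side. In this sketch the second rule is commented out and shown only for contrast; the sample path is illustrative.

```apache
RewriteEngine on

# With the back-reference: each page goes to its counterpart.
#   /path/to/page  ->  $1 = "path/to/page"
#                  ->  http://www.newdomain.com/path/to/page
RewriteRule ^(.*)$ http://www.newdomain.com/$1 [R=301,L]

# Without it, every old URI would land on the new home page:
#   /path/to/page  ->  http://www.newdomain.com/
# RewriteRule ^(.*)$ http://www.newdomain.com/ [R=301,L]
```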

Altering the URI structure
It may be the case that you don’t like the current structure of your URIs. Suppose your blog has URIs like this: http://www.olddomain.com/blog/category/ ... post-title (the category, then the date as year-month-day and the post title joined by dashes, as the rule below expects).

Say we decide dates are not that important: we only want to keep the year, and we want it separated by a slash rather than a dash. Additionally, we want to get rid of the ‘blog’ keyword, since the whole domain is for the blog. The following rule will do the trick.

Code:

RewriteEngine on
RewriteRule ^blog/(.*)/([0-9][0-9][0-9][0-9])-([0-9][0-9])-([0-9][0-9])-(.*)$ http://www.newdomain.com/$1/$2/$5 [R=301,L]
Here we have 5 matches being put into variables, one for each pair of parentheses.
The Category ($1)
Matches the characters after ‘blog/’ and before the following slash.
The Year ($2)
Matches 4 consecutive digits after the second slash and before the first dash after that.
The Month ($3)
Matches 2 consecutive digits between the first and second dashes after the second slash.
The Day ($4)
Matches 2 consecutive digits between the second and third dashes after the second slash.
The Post Slug ($5)
Matches all the rest of the line.
Before we proceed, notice a peculiarity here. Because the rule starts with the caret (^) and ends with the dollar sign ($), you are matching the whole line in this format. If the URI doesn’t start exactly with “blog/”, or doesn’t have the exact number of slashes and dashes we specified, it won’t be a match and will be ignored. Also, the category may be empty, as long as the two slashes are there, but the digits must be present in exact amounts. That, of course, is to be expected of any previously valid URL on the domain.
Now that we have the five variables with the original information parsed, we can use them as we want. We decided to get rid of the blog part (which we are matching literally, instead of via regex, because it’s too simple), the month and the day, and to reorganize category, year and post slug. We will use the first, second and fifth variables and simply ignore the rest.

Code:

http://www.newdomain.com/$1/$2/$5
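
As a worked example, assume a hypothetical post in the category php, published on 2010-03-13, with the slug my-first-post (these values are made up for illustration). The comments trace how the rule splits that URI. The rule shown is a slightly tighter spelling of the same idea, not the original: {4} and {2} replace the repeated [0-9] groups, and [^/]* keeps the category from spanning more than one path segment, which is an assumption about how categories look on your blog.

```apache
RewriteEngine on

# Request:  http://www.olddomain.com/blog/php/2010-03-13-my-first-post
# Tested:   blog/php/2010-03-13-my-first-post
#
#   $1 = php             (category)
#   $2 = 2010            (year)
#   $3 = 03              (month, captured but unused)
#   $4 = 13              (day, captured but unused)
#   $5 = my-first-post   (post slug)
#
# Result:   http://www.newdomain.com/php/2010/my-first-post
RewriteRule ^blog/([^/]*)/([0-9]{4})-([0-9]{2})-([0-9]{2})-(.*)$ http://www.newdomain.com/$1/$2/$5 [R=301,L]
```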
If you don’t want to redirect to a new domain but just rearrange the URIs within it, you may simply omit the domain in the redirection and use these rules:

Code:

RewriteEngine on
RewriteRule ^blog/(.*)/([0-9][0-9][0-9][0-9])-([0-9][0-9])-([0-9][0-9])-(.*)$ /$1/$2/$5 [R=301,L]
Reusing the domain for a different project
More often than not, after a domain redirect, the old domain will remain merely as an entry point for redirection. People do that for several reasons; one of them is that readers can be confused when old URLs redirect to a new site while the old domain itself hosts a different one.

But deciding whether or not to reuse your domain is outside the scope of this post. You may be splitting your site in two and leaving part of it on the old domain; after a few years, links to your old domain may be minimal; or you may simply have a better strategy than I can think of while writing this.

In any case, if you want to reuse your domain at some point, you must be aware of a few things.

All the redirected URIs are taken and can’t be reused without disabling the redirection. You guessed that, of course, but you have to keep in mind that when you use regexes for your redirects you are matching a whole class of URIs, even some that have never been used but match the pattern.

If you hurried up and used the first rule (repeated below) you are matching every single URI within the old domain and it means that any content there will be unreachable because the redirection rules will take precedence.

Code:

RewriteEngine on
RewriteRule ^(.*)$ http://www.newdomain.com/$1 [R=301,L]
If you used the second rule (repeated below) you are targeting a much more specific set of URIs. Anything that doesn’t have the exact format described (/blog/[characters]/[4digits]-[2digits]-[2digits]-[characters]) will be reachable.

Code:

RewriteEngine on
RewriteRule ^blog/(.*)/([0-9][0-9][0-9][0-9])-([0-9][0-9])-([0-9][0-9])-(.*)$ http://www.newdomain.com/$1/$2/$5 [R=301,L]
You’ll even be able to run a blog under http://www.olddomain.com/blog/ as long as you don’t use the exact same URI structure. Again, whether or not this is a good idea is up to you to decide.
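
If you did use the catch-all rule and later want part of the old domain back, one possible approach is to exempt specific paths with RewriteCond. This is only a sketch under assumptions: the /newproject/ directory is a hypothetical name, and the second condition assumes you want any file that physically exists on the old domain to be served rather than redirected.

```apache
RewriteEngine on

# Do not redirect anything under the (hypothetical) /newproject/ directory.
RewriteCond %{REQUEST_URI} !^/newproject/
# Do not redirect requests for files that exist on disk on the old domain.
RewriteCond %{REQUEST_FILENAME} !-f
# Everything else still goes to the new domain.
RewriteRule ^(.*)$ http://www.newdomain.com/$1 [R=301,L]
```

Consecutive RewriteCond lines are combined with AND, so a request is redirected only when it passes both conditions.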

301 or 302 redirects
Finally, you may have noticed that one of the parameters in the last part of the rules is the number 301. This is the type of redirection you are performing. You can use either 301 or 302 as the redirection code: 301 stands for a permanent redirection, whereas 302 stands for a temporary one.

Redirection codes are reported to user agents (browsers), search bots and anyone else who may want to know. While both redirection codes will take you to the new URI, there are important consequences of choosing either.

Because user agents and bots are informed of the redirection code, they can take actions based on them.

A temporary redirection is equivalent to an “Out For Lunch” sign. For some reason, the page is being redirected, but you are confirming that the original URI remains the permanent URI for the resource. Bear in mind that “temporary” doesn’t imply any length of time; you are only saying “Yes, this is the URI. We are operating there right now, but come back here the next time”.

Reasons for that could be:
Maintenance page
You may want to redirect all the traffic of your site for a few minutes or hours while you are performing an upgrade.

FeedBurner redirect
If you use FeedBurner, you may want people to subscribe to a URI within your domain that redirects to FeedBurner. If one day you want to use a different service, all your subscribers are still using your URI, so they will be redirected automatically.
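
Both cases can be sketched with R=302. The file name maintenance.html, the local path feed and the FeedBurner address are placeholders, and the feed rule assumes FeedBurner fetches your source feed from some other URL (otherwise its own requests would be redirected back to itself). In practice you would enable the maintenance block only while the upgrade is running.

```apache
RewriteEngine on

# Feed: subscribers use /feed on your domain, which points at the
# current feed service. Change the target if you switch services.
RewriteRule ^feed/?$ http://feeds.feedburner.com/your-feed-name [R=302,L]

# Maintenance: temporarily send every other request to a static page.
# The condition excludes the page itself to avoid a redirect loop.
RewriteCond %{REQUEST_URI} !^/maintenance\.html$
RewriteRule ^(.*)$ /maintenance.html [R=302,L]
```

A bare [R] flag without a code also produces a 302.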

A permanent redirection is stronger. It is equivalent to saying “We moved, we are now operating there and we are not coming back. Next time you may prefer to go there straight away”. The consequence is that search bots, intelligent user agents, social bookmarking sites and whoever else may care can update their links to the new location.

This is what makes it possible for search engines to transfer the ranking of your site to the new structure.