Putting Your .htaccess File to Work For You


A .htaccess file is a simple little configuration file that sits at the root level of your site and can do all sorts of nifty things for you. It’s kind of an oddball in the world of web files: your site can function just fine without one, so it’s not strictly necessary, but it can also do heavy, important lifting for you if you use it correctly.

Here’s the official definition of a .htaccess file from Apache, as well as when and when not to use one. One key thing to keep in mind is that the below is for Unix/Linux servers with Apache set up. That’s by far the most common setup for hosting accounts these days, so if you’re not sure what you’re running, odds are that you’re good to go.

For our purposes, we’re not going to worry so much about the permissioning side of .htaccess files, as far as protecting directories, etc. We’re just going to focus on ways to use the .htaccess file to make sure that our content gets indexed correctly, that error pages are handled efficiently, and that we keep spam comments down to a minimum.

First we have to create a .htaccess file. One thing to keep in mind is that its name and format are different from most files. It’s not short for anything and there’s nothing extra tacked onto the file name. It truly is .htaccess, nothing more, nothing less.

The easiest way to create one is to open up Notepad and create and save a file called .htaccess.txt. (Notepad tends to tack the .txt part on by itself, but we’ll remove it later, after uploading the file via FTP to the root of our website.)

So you’ve created the empty file. The first step is to set up how you want your site to handle requests that produce an error. Instead of users seeing an unhelpful 404 error page if they try to access a file that no longer exists, you can use the .htaccess file to automatically send them somewhere else, usually your home page.

Here’s a sample bit of code that you’d insert into your .htaccess file to handle error pages:

ErrorDocument 400 /index.html
ErrorDocument 401 /index.html
ErrorDocument 403 /index.html
ErrorDocument 404 /index.html
ErrorDocument 500 /index.html

What does that do? Anytime a user on your site encounters an error (400 and 404 errors are the most likely culprits), they’re automatically shown your home page instead of a stock error page. In this example the home page is located at index.html. If your home page were located at index.php, you’d insert that instead.

If you’d rather create a custom error page (such as one that said “This page no longer exists, but you can find great information about Wombats on the site here, as well as here.”), then you could change index.html in the above example to the name of the custom error page, something like error.html.
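As a rough sketch of that idea, the lines below send “not found” errors to a custom page and leave the other codes pointing at the home page. The file name error.html is just the example name from above and is assumed to exist at the root of your site.

```apache
# Custom page for missing files; assumes error.html exists at the site root
ErrorDocument 404 /error.html
# The other error codes can keep pointing at the home page
ErrorDocument 403 /index.html
ErrorDocument 500 /index.html
```

When the target is a local path like this, Apache serves that page’s content but still returns the original error status, so the address in the visitor’s browser doesn’t change.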

You can also use your .htaccess file to ensure that search engines spider and index the preferred version of your domain name. This gets into the world of SEO and PageRank, but the short version is that search engines can treat gadooney.com and www.gadooney.com as two separate sites. While users can type either and get to your content fine, you’d prefer to have search engines simply pick one and stick with it, to give you maximum traction in search engine results.

Making that happen is pretty simple, as you just have to enter the following code into your .htaccess file:

RewriteEngine On
RewriteCond %{HTTP_HOST} ^YourSite\.com [nc]
RewriteRule (.*) http://www.YourSite.com/$1 [R=301,L]

Obviously you need to replace the “YourSite” with your actual site information. Google also lets you set your preferred domain via their Webmaster Tools interface, which has some other useful goodies as well.
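If you’d rather standardize on the version without the www, the same idea works in the other direction. This is only a sketch using the same “YourSite” placeholder, and it assumes mod_rewrite is enabled on your host. Use one direction or the other, never both, or the two rules will bounce visitors back and forth.

```apache
RewriteEngine On
# Match requests that arrive on the www host (case-insensitive)
RewriteCond %{HTTP_HOST} ^www\.YourSite\.com$ [NC]
# Send them to the same path on the bare domain with a permanent redirect
RewriteRule (.*) http://YourSite.com/$1 [R=301,L]
```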

If you’ve changed the location of a file or an entire site, you can also use the Redirect directive to point to the new location, with something like the following (replacing the directory info and html file locations with the specific info for your site and files):

Redirect /OldDir/old.html http://www.site.com/NewDir/new.html
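One detail worth knowing: without a status code, Redirect sends a temporary (302) redirect. For a move that’s permanent, you’d probably want to say so explicitly. A minimal sketch, reusing the placeholder paths from the example above:

```apache
# Permanent (301) redirect for a single moved page
Redirect 301 /OldDir/old.html http://www.site.com/NewDir/new.html
# Permanent redirect for everything else under the moved directory;
# /OldDir/page.html would go to http://www.site.com/NewDir/page.html
Redirect 301 /OldDir/ http://www.site.com/NewDir/
```

The more specific line goes first so that it wins for old.html, and the directory line catches the rest.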

For those using WordPress who get hammered with spam, you can use the .htaccess file to cut a lot of that out at the source, by adding the following to your .htaccess file (with credit to JohnChow.com for pointing out this in a recent post):

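RewriteEngine On
# Only look at form submissions (POST requests)...
RewriteCond %{REQUEST_METHOD} POST
# ...aimed at the WordPress comment handler
RewriteCond %{REQUEST_URI} wp-comments-post\.php
# Flag the request if the referrer is not your own site, or the user agent is blank
RewriteCond %{HTTP_REFERER} !.*johnchow\.com.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^$
# Bounce flagged requests to the home page (the target is a plain URL, not a pattern)
RewriteRule (.*) http://www.thetechzone.com/ [R=301,L]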

Again, replace the details with those for your own site. All the above code really does is make sure that a person leaving a comment has a referrer from your own site and a non-blank user agent, which should almost always be the case with legitimate comments, since they’re submitted from a form on your own pages. Spam bots often post directly, without a referrer, so turning those requests away mostly impacts spam comments.
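If you’d rather refuse those requests outright than redirect them, one possible variation is to answer with a 403 Forbidden. This is a sketch, not the code from the post credited above; “YourSite” is the same placeholder used earlier and stands in for your own domain.

```apache
RewriteEngine On
RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{REQUEST_URI} wp-comments-post\.php
# Referrer does not mention your own domain, or the user agent is blank
RewriteCond %{HTTP_REFERER} !YourSite\.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^$
# "-" means no substitution; [F] answers with 403 Forbidden
RewriteRule .* - [F]
```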

You can pick and choose from the options above, and include the code you’d like to use in your .htaccess file. Once you’re ready to roll, save it. At this point, it’s still a .htaccess.txt file. Upload it to your server via FTP and put it in the root level of your site. Be sure to upload it in ASCII mode, not binary.

Once it’s uploaded, you’ll need to change the name of the file. Change the name to its true name, which is “.htaccess”. Don’t put a period after it, don’t put it in quotes, don’t add an extension, simply type in .htaccess as the new file name.

You may also need to CHMOD the .htaccess file to 644. Do a Google search on “.htaccess chmod 644” if you need instructions for that, as this is getting long enough as is.

Hopefully that gives you a basic introduction to some uses for the .htaccess file. Poke around some on your own, though, as it can do a lot more than what’s touched on above, and is a pretty powerful little file that’s often overlooked.

This entry was posted on Thursday, March 1st, 2007 at 11:21 am and is filed under Getting Started.

There is currently one response to “Putting Your .htaccess File to Work For You”


  1. On March 4th, 2007, Tim. Stanford said:

    Wait until you read this article, which shows every single Apache status code and the actual headers and source returned for each error: “Force Apache to output any HTTP Status Code with ErrorDocument”.

    I set up an automated system to view all 57 Apache response codes and ErrorDocuments, saving the headers and returned content for future reference. Use this page as a reference when designing scripts that use headers. Ex: 404 Not Found, 200 OK, 304 Not Modified, 503 Service Temporarily Unavailable, etc.

    When a status code is encountered, Apache serves the header and the ErrorDocument for that code. So you can see any header and ErrorDocument by causing that error on Apache.

    For instance, if you request a file that doesn’t exist, a 404 Not Found is issued and the corresponding ErrorDocument is served with the 404 Not Found Header. So we can see what Apache 404 Errors and Response Codes look like, but how do we cause errors for the 56 other Apache Response Codes?
