Clean URLs

By  Nick Kasprak  on June 04, 2012 8:30 PM

Note: I've since upgraded the site to Drupal, so the specifics below are no longer accurate.

Part of the reason I built this website was to practice my web development skills and experiment with new techniques. So, instead of installing some off-the-shelf blogging software, I wrote my own bare-bones blogging platform. It's still a work in progress, but I think I've got all the basics covered, and I learned a lot in the process - especially about regular expressions, mod_rewrite and clean URLs, all of which are subjects I'm relatively new to.

Clean URLs (for example, this post's URL is www.nickkasprak.com/1003/clean-urls rather than www.nickkasprak.com/index.php?ind_post_id=1003) have been around for a while, and there are two reasons to use them. First, they disguise the server platform you're using (in my case, PHP), which makes it slightly harder for hackers to mess up your site. Second, you can switch platforms and keep the same URLs for your content. Since I don't have any .php file extensions in my URLs, I could switch to Python and write things in such a way that the URL for each post stays the same, which is important for SEO.

I found that it was easiest to set up URL rewriting rules so that the "clean" text with dashes, after the post ID, doesn't actually matter at all - it's the number before it that tells the server which page to load. I later noticed that all of Gawker's various sites have similar URLs, so I tried an experiment with an article and wrote some gibberish into the clean portion of the URL. The article still loaded, so they've clearly set things up more or less the way I have. Since I'm self-taught with all of this stuff and don't have any formal training, it's always gratifying to find out that I'm doing things correctly.
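To make the idea concrete, here is a minimal sketch of that kind of rule, written in Python rather than as an actual mod_rewrite directive. It is not the code this site ran: the pattern and the rewrite function are made up for illustration, and only the index.php?ind_post_id= form comes from the example URL above. The regular expression captures the number and simply ignores whatever text follows it.

```python
import re

# Hypothetical pattern: a numeric post ID, then an optional slug that is
# matched but never used. Only the captured number matters.
CLEAN_URL = re.compile(r"^/(\d+)(?:/[^/?#]*)?/?$")


def rewrite(path):
    """Map a clean path to the internal query-string URL.

    Returns None when the path does not start with a numeric post ID.
    """
    match = CLEAN_URL.match(path)
    if match is None:
        return None
    return "/index.php?ind_post_id=" + match.group(1)


print(rewrite("/1003/clean-urls"))      # /index.php?ind_post_id=1003
print(rewrite("/1003/some-gibberish"))  # /index.php?ind_post_id=1003
print(rewrite("/about"))                # None
```

Because the text after the number is thrown away, any wording there resolves to the same post, which is the same behavior as in the gibberish experiment described above.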