Hashbangs – The Future of URLs or The End of The Internet?

Quick – what’s a URL? Most of you would point to that string of text at the top of your browser that defines the location of this page. But URLs represent a lot of things: references to pages, pointers to content, and the foundations of links. They’re the fiber of the web, and the entire notion of hypertext is about pages pointing to pages using URLs.

However, a new approach to building websites is threatening to turn this notion of URLs on its head. Two simple characters – #! (called either a hashbang or a shebang) – are creating more trouble than anything seen in years. Adding those to a URL makes it something else entirely, but to understand why, we first need to go over a couple of web fundamentals.

Many of you are probably aware of the underlying process behind opening a URL, but for the uninitiated here’s a very simple summary. When my browser wants to open a page, it takes the URL and splits out the server (blog.utest.com) and the path (/hashbangs-the-future-of-urls-or-the-end-of-the-internet/2011/06/). The browser connects to the server using HTTP and requests whatever content is at that path. The server then sends back the data. Over the course of loading a single webpage, the browser repeats this process, first to load the HTML and then to load any style sheets, images, JavaScript, and more.
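To make the split concrete, here is a minimal sketch in TypeScript. It uses the standard URL class to pull out the server and the path, then builds the text of a bare-bones HTTP GET request. The http:// scheme and the helper names splitUrl and buildGetRequest are assumptions made for illustration; a real browser sends many more headers.

```typescript
// Split a URL into the two pieces the browser needs: which server to
// contact and which path to ask that server for.
function splitUrl(rawUrl: string): { server: string; path: string } {
  const url = new URL(rawUrl); // standard URL parser (browsers and Node.js)
  return { server: url.hostname, path: url.pathname };
}

// Build the text of a minimal HTTP/1.1 GET request for that path.
function buildGetRequest(server: string, path: string): string {
  return `GET ${path} HTTP/1.1\r\nHost: ${server}\r\nConnection: close\r\n\r\n`;
}

// Assumption: the http:// scheme. The server and path are the ones named above.
const page = splitUrl(
  "http://blog.utest.com/hashbangs-the-future-of-urls-or-the-end-of-the-internet/2011/06/"
);
console.log(page.server);
console.log(page.path);
console.log(buildGetRequest(page.server, page.path));
```

The same request shape would be repeated for each style sheet, image, and script the HTML refers to.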

With the invention of AJAX, not every webpage has to be loaded like this. JavaScript code running in a browser can download snippets of HTML and update portions of a page on the fly, giving websites dynamic controls and content without a full page reload.

What if you configured JavaScript to load all the content for the entire website? In other words, what if your HTML was just a skeleton, and then you had JavaScript code download all the content separately? It sounds strange, but the benefit is a boost in performance. The HTML framework gets downloaded once, and then any other pageviews on that site are loaded via JavaScript. There’s only one problem – if getting to a page really just means executing a piece of JavaScript code, how do you create a link to that page?

Enter the hashbang. In a URL, anything after ‘#’ (the fragment) is kept by the browser and never sent to the web server. If I load http://twitter.com/#!/utest (Twitter uses hashbangs), then the web server for Twitter never sees the “/utest” part of the request. The browser simply asks for twitter.com and keeps the rest to itself. What Twitter sends back is a basic HTML skeleton and a big chunk of JavaScript.
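A small sketch of which part of the Twitter URL reaches the server and which part stays in the browser, again using the standard URL class. The function names requestTarget and fragmentOf are hypothetical, chosen only for this example.

```typescript
// Return the part of a URL that travels to the server in the HTTP
// request line: the path plus any query string. The fragment is left out.
function requestTarget(rawUrl: string): string {
  const url = new URL(rawUrl);
  return url.pathname + url.search;
}

// Return the fragment (from '#' onward), which never leaves the browser.
function fragmentOf(rawUrl: string): string {
  return new URL(rawUrl).hash;
}

const twitterUrl = "http://twitter.com/#!/utest";
console.log(requestTarget(twitterUrl)); // "/"
console.log(fragmentOf(twitterUrl)); // "#!/utest"
```

From the server’s point of view, this request is indistinguishable from a request for the plain twitter.com front page.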

That chunk of JavaScript runs in the browser and it knows all. It sees the text after the ‘#!’ and knows that it should request that content separately from Twitter’s APIs. After the JavaScript downloads the content, it updates the HTML skeleton (or, more precisely, the DOM built from it) and inserts the actual Twitter content. Click on a link within Twitter, and only the part of the URL after the ‘#’ changes, so the browser does not request a fresh page from twitter.com. Instead, that same JavaScript jumps in and dynamically loads whatever new content you request. Twitter saves bandwidth because the skeleton is sent only once, and the user gets a faster, snappier experience.
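The routing step might look roughly like the sketch below. It is not Twitter’s actual code: the API address (api.example.com) and the function names are invented for illustration, and the browser wiring is shown only as comments so that the logic itself can run anywhere.

```typescript
// Extract the in-page route from the URL fragment (the value of location.hash).
// "#!/utest" -> "/utest"; a fragment that is not a hashbang -> null.
function routeFromHash(hash: string): string | null {
  if (hash.slice(0, 2) !== "#!") return null;
  const route = hash.slice(2);
  return route === "" ? "/" : route;
}

// Hypothetical content API; not Twitter's real endpoint.
const API_BASE = "https://api.example.com/content";

// Map a route to the API URL the script would download content from.
function apiUrlForRoute(route: string): string {
  return API_BASE + route;
}

const route = routeFromHash("#!/utest");
if (route !== null) {
  console.log(apiUrlForRoute(route));
}

// In a browser, the wiring could look something like this (untested sketch):
//   function load(): void {
//     const r = routeFromHash(location.hash);
//     if (r === null) return;
//     fetch(apiUrlForRoute(r))
//       .then((res) => res.text())
//       .then((html) => { document.body.innerHTML = html; });
//   }
//   window.addEventListener("hashchange", load); // fires when only the fragment changes
//   load(); // handle the URL the page was first opened with
```

The hashchange listener is what lets a click on an in-site link swap content without a new page request.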

So what’s the downside? In order to use Twitter, I must have JavaScript enabled. What’s worse, if I want to write an application to automatically load or read Twitter pages, I would need to give it a JavaScript interpreter for it to work. Also, if there’s any kind of bug in the core JavaScript, then the entire site will fail (just ask Gawker about their experiences with hashbangs).

What does this mean for the humble URL? Well, hashbangs are something new. A hashbang URL no longer names a page that a server can return; it is effectively an instruction to the JavaScript running on the page. The downside is that many of the conventions of HTTP, like 404 error codes and 301 redirects, no longer apply to these URLs, because the server never sees the part after the ‘#’ and so cannot respond to it. On the other hand, the user experience should be somewhat improved.

Of course, many of the upcoming changes in HTML5 may make this all irrelevant anyway. Only time will tell.
