For Pete's sake, encode your URL-valued attributes!
Posted Aug 26, 2004 at 12:07 AM
As we all know, ampersands (&) serve as query string parameter delimiters in URLs. One all-too-common mistake is failing to encode them properly when they appear in HTML. The most common occurrence of this problem is in the href and src attributes of various tags, and this post sets out to discuss the problem, identify the fix, and explain why fixing it is important.
What’s the big deal?
Let’s consider a fictitious relative URL that I might use inside my music library, to find Blues artists, for example: href="/music/getmusic.aspx?type=artist&genre=blues&reg=1"
We’re passing three parameters: type, genre and reg. Now let’s consider the same URL, properly encoded and now valid: href="/music/getmusic.aspx?type=artist&amp;genre=blues&amp;reg=1"
The problem with the first approach (other than being invalid XHTML) is that my query string parameters look remarkably like HTML character entity references, and names that match an entity should be avoided as query string parameter names. Indeed, this is the problem: some browsers will actually resolve the entity references when generating the request to the URL! Not just old browsers, either, if that mattered (and it doesn’t). Current browsers like Mozilla, Firefox, Safari and IE/Mac do this today. Lucky for us, type and genre are not entity references. As for reg, whoops.
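To make the failure concrete, here is a rough sketch in Python. It assumes that the standard library function html.unescape is a fair stand-in for the lenient entity resolution described above (it is not a model of any particular browser), and it uses urllib.parse to show what a server-side parser would then see. The variable names are only for illustration.

```python
from html import unescape
from urllib.parse import parse_qs, urlsplit

# The unencoded attribute value from the example above.
raw_href = "/music/getmusic.aspx?type=artist&genre=blues&reg=1"

# html.unescape resolves legacy entity names such as "reg" even without a
# trailing semicolon; it stands in here for the lenient browser behaviour.
resolved = unescape(raw_href)
print(resolved)
# /music/getmusic.aspx?type=artist&genre=blues®=1

# What a query string parser on the server would see after that.
params = parse_qs(urlsplit(resolved).query)
print(params)
# {'type': ['artist'], 'genre': ['blues®=1']}
```

The reg parameter is gone, and genre has swallowed the registered trademark sign along with the "=1" that followed it.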
Server confusion and silly solutions
The browser converts an unencoded ampersand followed by the name of a character entity reference (as reg is in the example above) into what it thinks is the appropriate character (® in this case), but this is not what your server expects. You have a few choices here:
- Handle the incoming requests incorrectly, and hope that they arrive late at night when no one’s around.
- Write a smart parser that reverses the resolved character entity reference, and always use that, hoping you get it right and that your mapping of the variables you need to do this for is correct.
- Never use character entity reference names as query string parameter names, and hope that this is always the case, ad infinitum.
Or better yet…
Always just encode URLs in attribute values!
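If your pages are generated by code, the encoding step can be a one-liner. The following is a minimal sketch in Python, assuming the standard library function html.escape is an acceptable way to encode the attribute value; href_attribute is a hypothetical helper name made up for this example, not part of any library.

```python
from html import escape, unescape


def href_attribute(url: str) -> str:
    """Return an href attribute with the URL escaped for use in HTML.

    Hypothetical helper for illustration only.
    """
    # quote=True also escapes double and single quotes, so the value
    # cannot break out of the surrounding attribute quotes.
    return f'href="{escape(url, quote=True)}"'


url = "/music/getmusic.aspx?type=artist&genre=blues&reg=1"
attribute = href_attribute(url)
print(attribute)
# href="/music/getmusic.aspx?type=artist&amp;genre=blues&amp;reg=1"

# Decoding the escaped value gives back the original URL, reg and all.
assert unescape(escape(url, quote=True)) == url
```

With the ampersands written as &amp;, there is nothing left in the attribute that resembles an entity name, so the reg parameter arrives at the server intact.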
About this page
This page contains a single post from Daniel Boerner's blog.
Are there more posts like this one?
Possibly. Within this blog, this post is categorized under webdev and it was posted on August 26, 2004. Those would be good places to start looking for related posts.