A blog post from danboe.net

Marking up text on a web page

Posted Aug 15, 2004 at 12:12 AM

Marking up what on a web page? Text? In the hopefully likely event that you have text on your page (it’s the stuff between the tags, objects and images), here are some techniques for structuring text, assuming you’re comfortable formatting it with CSS.

When I originally authored this post, I called it “Text formatting.” Whoops. That’s the whole point. It’s not about format, it’s about structure. We must unlearn what we have learned.

Focus on structure, not presentation

One of the core themes of this document is to build markup from a starting point that is structural and semantic in nature, not presentational. The consistent point is that HTML pages developed according to web standards contain data surrounded by tags that convey structure and meaning, rather than tags that give presentational instructions. The day has come when these presentational instructions can be managed and delivered outside of the markup, in CSS. Taking advantage of this forces a one-time rethinking of how we mark up content today, and a close examination of why we use the approaches we do. The following outlines a simple case where such a change is necessary: emphasizing pieces of the content itself.

(Re)introducing em and strong

The W3C specifies the following ways to emphasize pieces of content:

  • em indicates that the contained content should be emphasized.
  • strong indicates that the contained content should be more strongly emphasized.

That’s it. Pretty simple.

Use em, not i

  • Visual browsers will usually render text within an em element as italic as an indication of its emphasis.
  • Assistive technologies will read such text appropriately, usually by adjusting the volume, pitch or playback rate as necessary.

Use strong, not b

  • Visual browsers will usually render text residing within a strong tag as bold as an indication of its emphasis.
  • Assistive technologies will read such text appropriately, usually by adjusting the volume, pitch or playback rate as necessary.
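
As a rough sketch of the difference, here is the same text marked up both ways. The sentences are made-up placeholders. In most visual browsers the two versions look alike, but only the first one says anything about emphasis.

```html
<!-- Semantic: the element names state that the text is emphasized -->
<p>Please <em>read</em> the instructions before you begin.</p>
<p><strong>Warning:</strong> this action cannot be undone.</p>

<!-- Presentational, shown only for comparison: same look, no meaning -->
<p>Please <i>read</i> the instructions before you begin.</p>
<p><b>Warning:</b> this action cannot be undone.</p>
```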

Use CSS when bolding and italicizing are desired for presentation, not emphasis

Of course, there are times when we may want to style certain sections of the site as either bold or italic, or even both. For example, bolding core navigation elements can draw more attention to them, or just help to visually differentiate this section from the rest of the page.

This is fine, as long as it’s accomplished with CSS. This is common sense when you consider the page without styling. Here’s what’s wrong with littering such markup with b and i tags:

  • In making such stylistic decisions, our intent is not to emphasize this content over other content. We just want to establish a particular look and feel for the visual presentation. We do not intend for such sections to be read louder, higher or faster. We just think a certain look is nicer.
  • These extra b and i tags add page weight.
  • These additional b and i tags will not convey emphasis to assistive technology anyway.
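
One way this might look in practice is sketched below. The id "nav" and the link targets are assumptions made up for the example. The bolding lives entirely in the style rule, so the list itself carries no b tags and claims no emphasis.

```html
<style type="text/css">
  /* Presentation only: bold every link inside the navigation list */
  #nav a { font-weight: bold; }
</style>

<!-- The id and URLs are hypothetical -->
<ul id="nav">
  <li><a href="/">Home</a></li>
  <li><a href="/archives/">Archives</a></li>
  <li><a href="/about/">About</a></li>
</ul>
```

With styles turned off, this is still a plain list of links, which is exactly what it is structurally.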

Desire bold and italic rendering?

In some cases, you may want to render text both bold and italic. To decide how to mark up such text, first determine what level of emphasis is intended. This should drive the decision: em for emphasis, strong for greater emphasis. The following approach could also be safely used:

  • <em class="bold">I am only emphasized, but I am bold and italic!</em>
  • <strong class="italic">I am greatly emphasized, and I too am bold and italic!</strong>
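
For those two class names to do anything, the stylesheet has to define them. A minimal sketch, assuming the default rendering described earlier (em italic, strong bold), so each rule only adds the missing half:

```html
<style type="text/css">
  /* em is usually italic already, so only bold is added */
  em.bold { font-weight: bold; }

  /* strong is usually bold already, so only italic is added */
  strong.italic { font-style: italic; }
</style>

<p><em class="bold">I am only emphasized, but I am bold and italic!</em></p>
<p><strong class="italic">I am greatly emphasized, and I too am bold and italic!</strong></p>
```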

abbr and acronym

Ever wonder why so many of the HTML tags support the title attribute? Here are two good reasons:

  • <abbr title="The Microsoft Network">MSN</abbr>
  • <acronym title="American Standard Code for Information Interchange">ASCII</acronym>

Using the markup in this manner assists users who may not be familiar with the term.

  • Visual browsers will usually provide an unobtrusive but very effective method to assist these users by underlining the term with a dashed line and providing a tool tip when the user moves the mouse over the term. Note that IE currently supports this only for acronym, not abbr. This is provided for information only and should not influence the markup: other browsers support both correctly, and always using acronym to work around IE is wrong for assistive technologies.
  • Assistive technologies will recognize the abbr element and spell out the term instead of attempting to pronounce it: M-S-N, not “missen”.
  • Likewise, they will recognize the acronym element and pronounce the text, instead of spelling it out: “ascii”, not A-S-C-I-I.
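
If you would rather not depend on each browser's default rendering, the dashed underline can be requested explicitly. This is only a sketch: the sentence is a placeholder and the rule is one possible style, not a required one.

```html
<style type="text/css">
  /* Give both elements the same hint that more information is available */
  abbr, acronym {
    border-bottom: 1px dashed;
    cursor: help;
  }
</style>

<p>
  I read my mail on <abbr title="The Microsoft Network">MSN</abbr>, and the
  messages are plain
  <acronym title="American Standard Code for Information Interchange">ASCII</acronym>.
</p>
```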

address

  • Intended for signifying content that represents information such as addresses, electronic signatures, lists of authors, etc.
  • By default, most browsers will render it as a block element with the contained text in italics, which can be overridden with CSS.
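
A small sketch of address in use, with the default italics switched off. The wording and the e-mail address are hypothetical.

```html
<style type="text/css">
  /* Override the usual italic rendering */
  address { font-style: normal; }
</style>

<address>
  This page is maintained by
  <a href="mailto:webmaster@example.com">the webmaster</a>.
</address>
```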

cite

The purpose of cite is to mark a citation of, or reference to, a source: for example, a web site, author or article.

  • By default, most visual browsers will render the contained text in italics. Of course, this can be overridden with CSS.
  • In quotations, or when providing attribution text to a module, we should use cite.
  • By being consistent in this usage, we gain a valuable ability: extracting this information from the pages later as needed, or using it to provide alternate, source-based navigation.
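
Here is roughly how a quotation with attribution might be marked up. The quoted text, the title and the URL are all placeholders. Note that the cite attribute on blockquote (a URL pointing at the source) is a separate thing from the cite element (the visible reference).

```html
<!-- The cite attribute holds the source URL; this one is hypothetical -->
<blockquote cite="http://www.example.com/article">
  <p>The quoted passage goes here.</p>
</blockquote>

<!-- The cite element marks the visible attribution -->
<p>From <cite>Title of the Quoted Article</cite></p>
```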

code

  • Intended for providing code examples in the page.
  • Default rendering can be overridden with CSS.

del

  • Intended for signifying content that has been deleted.
  • Default rendering can be overridden with CSS.
  • What do assistive technologies do with this?

dfn

  • Similar to abbr and acronym, but intended for marking the defining instance of a term, that is, the place where the term is first defined.
  • Default rendering can be overridden with CSS.
  • What do assistive technologies do with this?
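
A minimal sketch, using a made-up definition: dfn wraps the term at the point where the surrounding sentence defines it.

```html
<p>
  <dfn>Markup</dfn> is the set of tags added to a document to describe
  the structure of its content.
</p>
```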

div

  • Generic block container for content.
  • Ignored by both visual browsers (unless CSS defines styles for it) and assistive technologies.
  • Provides no semantic meaning to the structure.

ins

  • Intended for signifying content that has been inserted.
  • Default rendering can be overridden with CSS.
  • What do assistive technologies do with this?
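
Since ins usually travels together with del, here is a sketch of the two marking a correction. The sentence and the timestamp are invented for illustration. Both elements accept an optional datetime attribute recording when the change was made.

```html
<style type="text/css">
  /* One possible override of the default rendering */
  del { text-decoration: line-through; color: #666; }
  ins { text-decoration: none; background-color: #ffc; }
</style>

<p>
  The meeting is on
  <del datetime="2004-08-15T00:12:00Z">Tuesday</del>
  <ins datetime="2004-08-15T00:12:00Z">Wednesday</ins>.
</p>
```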

kbd

  • Intended for signifying text that the user should enter with the keyboard.
  • Default rendering can be overridden with CSS.

p

  • Intended for signifying a logical block of text as a paragraph.
  • By default, most visual browsers will render with appropriate spacing above and below the paragraph, and in some cases, indenting the first line. Of course, this can be overridden with CSS.
  • Assistive technologies will utilize paragraph tags as cues for pausing the reading of text contained within.
  • Note: unlike the br tag, p gives semantic meaning to the content, which is why it is used. (br is not formally deprecated in HTML 4.01 or XHTML 1.0, but it is essentially a presentational line-break instruction, which is why it should not be used to structure text.) Additionally, p provides a style hook for the text it contains, while br, being an empty element, cannot contain anything.
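
To make the contrast concrete, here is a sketch with placeholder text. The first version fakes paragraphs with line breaks. The second marks them up as paragraphs and leaves the spacing to CSS.

```html
<!-- Avoid: br used to simulate paragraphs -->
<div>
  First thought.<br><br>
  Second thought.
</div>

<!-- Prefer: real paragraphs, styled through the p hook -->
<style type="text/css">
  p { margin: 0 0 1em 0; text-indent: 0; }
</style>

<p>First thought.</p>
<p>Second thought.</p>
```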

pre

  • Containing element for preformatted text. Enclosed text is usually rendered while preserving source white space and line breaks.
  • Unsure of assistive technology interpretation of this.
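
A short sketch of pre wrapping a code listing, where the indentation and line breaks matter. The function is a made-up example. Nesting code inside pre is a common pairing, not a requirement.

```html
<pre><code>function greet(name) {
    return "Hello, " + name;
}</code></pre>
```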

samp

  • Intended for providing examples of program or script output in the page.
  • Usually, visual browsers will render text in a monospaced, serif font, which can be overridden with CSS.

span

  • Generic inline container for content.
  • Ignored by both visual browsers (unless CSS defines styles for it) and assistive technologies.
  • Provides no semantic meaning to the structure.
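
Because div and span mean nothing on their own, their value is as hooks for CSS. A rough sketch, with a hypothetical id and class name:

```html
<style type="text/css">
  /* The names are assumptions made up for this example */
  #sidebar { float: right; width: 12em; }
  .category { text-transform: uppercase; }
</style>

<!-- div groups block-level content; span wraps a run of inline text -->
<div id="sidebar">
  <p>Posted in <span class="category">webdev</span></p>
</div>
```

If a more meaningful element fits the content, reach for that first and keep div and span for the cases where nothing else applies.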

var

  • Intended for providing program parameters or variables in the page.
  • Usually, visual browsers will render text in italics, which can be overridden with CSS.
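
The four computer-related elements (code, kbd, samp and var) are easiest to tell apart side by side. This is only a sketch: the command and the program output are invented for illustration.

```html
<!-- code: a fragment of source code; var: a placeholder the reader fills in -->
<p>Call <code>parseInt(<var>value</var>, 10)</code> to convert a string to a number.</p>

<!-- kbd: text the user types -->
<p>Type <kbd>help</kbd> and press Enter.</p>

<!-- samp: sample output from a program -->
<p>The program responds with <samp>Unknown command</samp>.</p>
```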

About this page

This page contains a single post from Daniel Boerner's blog, of which Boot Camp + Windows Vista = no more Airport Extreme reboots is the latest post.

Are there more posts like this one?

Possibly. Within this blog, this post is categorized under webdev and it was posted on August 15, 2004. Those would be good places to start looking for related posts.