Heading Elements, Semantics and the Spec | February 7, 2005

Many people have taken to using the h1 element to describe the site a particular page belongs to. This usually involves wrapping an h1 tag around the site name, hiding the enclosed text from browsers using CSS and possibly swapping in the site logo using an image replacement method.
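For anyone who hasn’t seen the pattern, it looks roughly like this. This is only a sketch: the site name, the logo.gif file and its dimensions are placeholders I’ve made up, and the negative text-indent trick is just one of several image replacement methods people use.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Example Site: Some Article</title>
  <style type="text/css">
    /* Image replacement: push the heading text out of view and show
       the logo as a background image instead.
       logo.gif and the 200x60 size are placeholders. */
    h1#site-name {
      width: 200px;
      height: 60px;
      margin: 0;
      background: url(logo.gif) no-repeat;
      text-indent: -9999px;
      overflow: hidden;
    }
  </style>
</head>
<body>
  <!-- The h1 names the site, so it is identical on every page -->
  <h1 id="site-name">Example Site</h1>

  <!-- The actual topic of this page ends up demoted to an h2 -->
  <h2>Some Article</h2>
  <p>Article text goes here.</p>
</body>
</html>
```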

I can see the logic to this. Many people see a website much in the same way they see a book. Each page on a site is like a chapter in a book, so each page gets its own headline. However, that chapter is part of a collection of chapters and it’s important to know what that collection is called. How else would you be able to buy it off Amazon? In the same way, people feel that a webpage is naturally part of a collection of pages called a website, so the main heading of all the pages on a site should be the name of the site.

The spec says “A heading element briefly describes the topic of the section it introduces. Heading information may be used by user agents, for example, to construct a table of contents for a document automatically.”

The purpose of the heading element therefore is to describe the structure and content of the current page, not to describe the structure of the site or how this particular page fits into a larger hierarchy. By using the H1 element to link a document to a group of documents, the web author is almost trying to create their own metadata. However, that isn’t necessary, as HTML already provides us with the means of doing this using the link element.
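As a rough sketch of what I mean, the page below keeps its headings for its own content and describes its place in the site with link elements in the head. The rel values (start, contents, prev) are link types listed in HTML 4.01; the URLs, titles and subheadings are placeholders I’ve invented for the example, and how much a given browser actually does with these links varies.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Heading Elements, Semantics and the Spec</title>

  <!-- Relationships between this document and the rest of the site.
       All href and title values here are hypothetical. -->
  <link rel="start" href="/" title="Example Site home page">
  <link rel="contents" href="/archives/" title="Example Site archives">
  <link rel="prev" href="/archives/previous-article/" title="Previous article">
</head>
<body>
  <!-- The level one heading describes this page, not the site -->
  <h1>Heading Elements, Semantics and the Spec</h1>

  <h2>What the spec says</h2>
  <p>...</p>

  <h2>Why it matters to screenreader users</h2>
  <p>...</p>
</body>
</html>
```

A user agent building a table of contents from these headings would get the article title at the top with two subsections beneath it, and nothing about the site name getting in the way.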

Most web authors are using the H1 element this way in order to increase the meaning of the document. However, in many respects they are actually doing the opposite and adding superfluous information. For instance, much in the same way that a sighted user will scan a page’s headings looking for information, an experienced screenreader user may have all the level one headings read aloud. This is because they know that level one headings should accurately describe what a page is about. If the site author has decided to use the level one headings to describe the name of the site instead, screenreader users may naturally assume that the pages don’t contain the info they are looking for and go elsewhere.

A lot of web authors also mistakenly believe that there should only be a single h1 per page. This makes sense if all the information on the page is about a single subject, but it makes less sense if the page has groups of unconnected information. For instance, this article is primarily about headings; however, the sidebar contains unrelated information. Unfortunately, because the sidebar doesn’t have its own h1, this information is structurally considered part of the headings article, even though it has nothing to do with it.
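One way this could be marked up, loosely following the DIV example in the headings section of HTML 4.01, is to give each unrelated group its own div and its own h1. Treat this as a sketch rather than a rule: the ids, the sidebar heading and the links are made up, and HTML 4.01 doesn’t formally define a div as a section, so grouping a heading with its content this way is a convention.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Heading Elements, Semantics and the Spec</title>
</head>
<body>
  <!-- Main content: everything in here belongs to the article's h1 -->
  <div id="content">
    <h1>Heading Elements, Semantics and the Spec</h1>
    <p>Article text goes here.</p>

    <h2>Comments</h2>
    <p>Comment text goes here.</p>
  </div>

  <!-- Sidebar: unrelated to the article, so it gets its own h1
       instead of sitting under the article's heading -->
  <div id="sidebar">
    <h1>Elsewhere on this site</h1>

    <h2>Recent entries</h2>
    <ul>
      <li><a href="/archives/example-entry/">Example entry</a></li>
    </ul>
  </div>
</body>
</html>
```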

Posted at February 7, 2005 8:47 AM

Comments

Andrius Mazeika said on February 7, 2005 9:03 AM

I’ve been wondering about this issue for quite some time. Wrapping the page title in an h1 element never seemed like a good idea to me. Now I am confident about it. Thanks for a great article, Andy ;)

Rob McMichael said on February 7, 2005 9:07 AM

Hmm interesting subject.
I have always thought the page title can be used either for this or in combination with this.
I tend to use the H1 tags to describe the highest level header on a page and then work down. I have never really thought about a site-wide hierarchy.

Perhaps it is SEO that has distracted H1 tags from their original purpose?

Chris Vincent said on February 7, 2005 9:50 AM

I’ve always thought about the subject in very much the same manner you describe. However, I still use <h1> to describe the name of the site a lot of the time.

Why? Because it’s the best semantic alternative I think we currently have. There’s the <title> element, but it doesn’t display within the document (unless you employ some tricks that don’t work in many browsers).

Also, viewed without CSS, <h1> is typically rendered in a large font or in another prominent way (for browsers without the ability to display varying text sizes). This makes it suitable for displaying the name of the site in such circumstances without adding presentational markup.

I agree that <h1> isn’t especially suited for the job, but for now it’s the best option (I think).

adam said on February 7, 2005 9:55 AM

You’ve made me think again about this. I’ve taken the view of marking up the site name/masthead in an h1 tag; as Rob says, this is primarily from an SEO standpoint.

Would using an h1 tag in this way lead to a good semantic text version of a site?

John Oxton said on February 7, 2005 10:16 AM

That is me all over. The question is, does that mean the name of the site is unimportant, and how should we mark up the site name? Or should we not bother?

Patrick said on February 7, 2005 10:47 AM

I completely agree, for the most part. You should think of an HTML page as a standalone document, even if it is (and usually will be) one document within a collection.

I think a book analogy can actually be a good one, but instead of thinking of the h1 as being the name of the book, it should be seen as the heading of the chapter, with h2-h6 as the subheadings.

One thing I would disagree with is “A lot of web authors also mistakenly believe that there should only be a single h1 per page.” I actually tend to believe that there should only be a single h1 per page, but I don’t consider myself to be mistaken (it may be debatable, but it’s certainly not cut-and-dry). The h1 is the heading of the document, so it doesn’t make sense to me to use it more than once. Navigational areas are supplementary to the document, so fit perfectly well under the h1 heading.

See http://www.w3.org/QA/Tips/Use_h1_for_Title
“<h1> is the HTML element for the first-level heading of a document”

Anne said on February 7, 2005 11:11 AM

The only browser in which displaying the TITLE element does not work is Internet Explorer. That is a single browser, not “many browsers” as said in a previous comment.

The screen reader user argument of the main article does not really apply I guess. After all, they are browsing the current web, not “utopia web”.

About the sidebar and such. We indeed need separate markup for that. Web Applications 1.0 (sometimes referred to as HTML 5.0) is going to address these problems. The solutions will also be ported to the XHTML 1.0 namespace although that is not really useful for the current web obviously, since Internet Explorer is out there too. (With market share.)

For now, using DIV elements to separate the sidebar from the main contents and using DIV elements to separate the comments from the article, the articles from each other on the frontpage, et cetera, is good enough.

Anne said on February 7, 2005 11:17 AM

The problem, of course, is that HTML was never intended for web applications, documents with sidebars, comments and online discussion forums. (Et cetera.) HTML was intended to publish scientific documents online.

This problem has also never been addressed by the W3C and that is why most of these “what semantics can I use here” discussions arise.

Kev said on February 7, 2005 11:48 AM

I constantly waver on this, sometimes using h1 and sometimes creating a div for the site-wide stuff.

I see what you mean about semantic correctness of not using it but I think I’ll carry on using it until the spec gives us a decent alternative.

Matthew Pennell said on February 7, 2005 1:38 PM

I guess the ideal solution would be to use [part of] the <title> tag for compliant browsers, use JavaScript to grab the title and display it for IE, and a noscript fallback for IE users with JS disabled.

head, title { display: block; }

Keith Bell said on February 7, 2005 1:38 PM

For the same reason as Patrick gave, I have often thought that there should be only one H1 element on a page. However, Clause 7.5.5 of HTML 4.01 says “The following example shows how to use the DIV element to associate a heading with the document section that follows it” and from the example that follows, there seems a clear implication that more than one H1 element is permissible. According to the example, DIVs would be used to separate the sections to which each H1 element relates.

Hmmm…

soxiam said on February 7, 2005 2:56 PM

I would also like to chime in and echo others regarding a single H1 per page. I think one of the reasons why this practice has become accepted as the ‘right’ way is the many SEO articles out there that point out one H1 per page is good for increasing the ‘weight’ of H1 when the page is getting crawled. I don’t know how much it helps or if there’s any real evidence that supports this theory, but you know how it is with SEO: it has a tendency to turn theories into bible when the truth is shrouded in mystery.

Jesse Pearlman Karlsberg said on February 7, 2005 3:37 PM

So far the only alternative to using h1 for a page title or website name or something is to use styling and JavaScript to display the title tag. This won’t work, however, for cases when the title is particular to the page, as is the case for this page (“Andy Budd::Blogography: Heading Elements, Semantics, and the Spec”), while the text or image at the top of the page is general and the same on every page in the site.

So, there needs to be a different solution. If what appears at the top of the page isn’t exactly an h1, and isn’t exactly a title, what is it? Is there a tag for that?

I feel like I’ve seen this discussion play out a few times, without any satisfying resolution, and I think that is why most web programmers continue to use h1 for their logos. I would love to see an acceptable alternative, myself.

Kim Siever said on February 7, 2005 4:18 PM

I agree with Jesse. People do to titles what they do to h1 tags, they include the site title in them, so styling the title tag isn’t any better of an option.

Jeff Adams said on February 7, 2005 4:33 PM

I would like to see Andy put forth some suggestions as to how he would solve this.

On this site it looks like you rely on the title tag and an image for each page.

Andy Budd said on February 7, 2005 5:49 PM

Patrick said:

“I think a book analogy can actually be a good one, but instead of thinking of the h1 as being the name of the book, it should be seen as the heading of the chapter, with h2-h6 as the subheadings.”

That’s what I was implying, although I guess I wasn’t clear enough. On the subject of multiple H1s, the spec says:

“Some people consider skipping heading levels to be bad practice. They accept H1 H2 H1 while they do not accept H1 H3 H1 since the heading level H2 is skipped.”

Which seems to imply that multiple h1s are acceptable.

Roger Johansson said on February 7, 2005 7:42 PM

Interesting topic, Andy, and a difficult one. I’m with the “use the highest level heading for the title of the current document” gang. It seems most logical to me. A possible exception is a site’s home page, where the site name may be relevant enough to go in an h1 element.

I also don’t think putting the site name in an h1 element is going to improve search engine rankings. Think about it: if someone searches for your site/company name, they already know about you, and the search is specific enough that they will find you anyway. You want your highest level headings to describe what your documents are about, so you can be found by those searching for what you are offering.

To make sure search engines pick up your company or site name, you can use the title element to include it on every document if you want to (along with the title of each document, of course).

Zelnox said on February 7, 2005 8:28 PM

What Keith Bell pointed out is how I see it. I usually wrap headers with divs. But I can’t yet think of a situation where I’d have two level one headers on the same page. If it happened, I think I would split them into two documents even if it was valid. I suppose it depends on how the document is conceptually divided.

AkaXakA said on February 7, 2005 8:41 PM

“A lot of web authors also mistakenly believe that there should only be a single h1 per page.”

So… you could simply use 3 H1s per page: one for the site name, one for the content and one for the sidebar.

Right?

PS. I’m getting a simplebits Deja-Vu :)

Adam said on February 7, 2005 9:33 PM

What browsers that most of us care a great deal about are there that can’t understand arbitrary markup within HTML? I think that there are vanishingly few that would completely muck things up.

So why don’t we do the following:

<html xmlns="..." xmlns:ab="http://schema.andybudd.com/extensions/html">
...
<body>
<ab:sitename> <a rel="siteroot" href="/">Andy Budd</a> </ab:sitename>
<ab:sectionname> <a rel="sectionroot" href="/Blogograph">Blogography</a> </ab:sectionname>
...
</body>
</html>

I haven’t actually tested this but I think it could work. It might cause problems validating, but I think we should be validating xhtml with schemas anyways so we can do things like this.

Just a thought.

Adam said on February 7, 2005 9:48 PM

Bah. The preview was rendering things differently than what the final result was. The markup I meant was:

<html xmlns="…" xmlns:ab="http://schema.andybudd.com/extensions/html">
  <head/>
  <body>
    <ab:sitename>
      <a href="/">Andy Budd</a>
    </ab:sitename>
    <ab:sectionname>
      <a href="/Blogograph">Blogography</a>
    </ab:sectionname>
  </body>
</html>

I hope this is legible. Basically I mean that we should just agree on some extensions and use CSS to style them.

Schultzy said on February 7, 2005 9:54 PM

I said this on the Josuaink blog when he had a similar article.
I think that it should be a case that we can do what we like.

Bear in mind: what would you do if one page had the same thing on it twice, in different languages?
Would you use another h1 tag?
I would..

I feel that thinking of it as a book is a good idea. Say you had a book…

The title of the book is the H1 tag.

What if you were running two volumes of the book or something?

We should think before acting…

Michael Newton said on February 8, 2005 4:55 AM

Sticking to the book analogy, I tend to treat each page as a chapter. I don’t even put the site title on the page (nor would I repeat the title of a book at each new chapter.)
Of course, it’s there in the title tag, just like it would be in a book: at the top of each window, but not a part of the content. Isn’t that what the title tag is for?
Of course, I am not a web designer, since I have no artistic talent. Just someone who has to build the occasional web page, and likes clean, semantic HTML.

Benvolio said on February 8, 2005 9:08 AM

I like to think about how the use of h1-6 tags can help enhance user experience rather than their semantic or SEO benefits.

“However that chapter is part of a collection of chapters and it’s important to know what that collection is called.”

For me headings offer a way to break up sections of a document to increase scan-ability and readability. It offers the user a guide to the hierarchy of the document and the significance of each section. Go here for an example I whipped up.

In an unstyled document, the user is presented with a large block of text. It’s difficult to discern any hierarchy and also difficult to read. By simply adding some basic structure, the document instantly becomes more readable and scannable, and a clear hierarchy emerges.

In terms of chapters, I think you guys are not giving the users enough credit. Each time I come to a new chapter in a book, I don’t need to be told that this chapter called ‘Setting Sun’ is a child of ‘Book II’ which is in turn a child of the whole book called ‘The Sun Also Rises’. I have enough mental capacity to remember the title of the book. So what’s important is that we provide clear location cues to the user so they are aware that they’re viewing the sub-section named ‘Setting Sun’ within section ‘Book II’ on the site called ‘The Sun Also Rises’.

Stuart Langridge said on February 8, 2005 10:14 AM

Benvolio: you’re right, when you’re reading a book you don’t need to be reminded of a title. If it’s your book, and you’re reading it from beginning to end.
Try an experiment. Walk into a library, and pick up a random book. Don’t look at it; don’t read the cover. Open it to a random page and read the page. Now you do not know what the title of the book is. You might find the book title at the bottom of the page (like the title tag on a web page, as suggested by Michael Newton). If you actually wanted to know about the book, though, you’d just close it and look at the front cover. A link to the front page of the site is analogous, and of similar importance, and should be similarly easy to perform, and the way to express that “this is important and easy to do” semantically is to use an h1.

Björn said on February 8, 2005 11:40 AM

Great article. It should open a lot of people’s minds.

There is only one point I don’t agree with. I think there should only be one h1 heading. This provides the topic of the actual page. But then it could be followed by multiple level 2+ headings.

R. Marie Cox said on February 8, 2005 4:24 PM

Dave Shea @ Mezzoblue recently linked to a Semantic data extractor put out by the W3C. When the extractor presents a page with multiple h1’s, it doesn’t give precedence to the first instance of an h1 tag — or even care how many h1 elements are used — and instead follows a basic outline-type hierarchy. So a properly structured document would look like:

h1
- h2
- - h3
- h2
h1
- h2

The lesson, according to the extractor, is that it’s less the number of times a particular heading element is used and more the attention paid to hierarchy and grouping respective to the other heading elements within the document itself.

So, my question is: if heading elements shouldn’t be repurposed for navigational usage then what, if any, HTML elements are better suited to describe site-wide hierarchies/orientation semantically?

Emil Virkki said on February 8, 2005 7:26 PM

Like AkaXakA said: if we can use multiple h1 elements on a page, with a separate h1 each for the content and the sidebar, then why can’t we use a separate h1 for the nav?

By doing so, we could put the website title as the h1 of the navigation list. This would make sense, because the nav isn’t a part of the content of the page, but is a sort of “table of contents” of the site. And using the website title as the heading of the table of contents would be a logical option, unless of course we want to use “table of contents” as the heading :)

Richard said on February 10, 2005 10:45 PM

A lot of good points mentioned here. However, sometimes a web page isn’t read by the user as part of a larger site but instead it is perceived as an individual unit. This is especially the case in relation to articles. If I’m doing research for a paper and I’ve collected all these different articles from different sites (but all on the same topic) and printed them out, then I think it’s important that I know which site they come from (especially when quoting and referencing).

Think of a bibliography: Author, Title, Publication, Date. We need this information and it has a semantic meaning…

Just a thought :o)