Semantic Coding | May 7, 2004

The rise of web standards has also seen the rise of semantic coding. We’ve had it drummed into us a thousand times that <h1>s are for headings and not for big text, that <blockquote>s are for quoting text and not for indenting. We’re all so used to this now that we do it by default (don’t we?).

Semantic markup has been sold to us very well. The dream is that one day, other computer systems will see meaning in your code. Google will look at your headlines and think “hmm, that’s a headline, it must be important”. Speech browsers will see that something is wrapped in an <em> and will dutifully emphasise it. Using (X)HTML elements in the correct, semantic way is very laudable and long may it last.

If you’re a CSS author, you’ve probably used container elements to help control the visual display of your pages. You’ll wrap a <div> or a <span> around another element (or group of elements) to act as a hook for your style. By doing this you’re adding meaningless markup to your code for display reasons. This makes you feel bad, so you’ll try to give these hooks some meaning. You’ll put all your branding code inside a <div> called branding and your main content will go inside a <div> imaginatively called mainContent.
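To make that concrete, here is a rough sketch of the sort of hooks I mean. The ids branding and mainContent are the ones mentioned above; the page content and the style rules are made up purely for illustration.

```html
<html>
<head>
  <title>Hooks named after their content</title>
  <style type="text/css">
    /* The ids exist only so the stylesheet has something to grab */
    #branding    { background: #036; color: #fff; }
    #mainContent { width: 40em; margin: 0 auto; }
  </style>
</head>
<body>
  <!-- Wrapper for the logo and strapline -->
  <div id="branding">
    <h1>Site name</h1>
  </div>

  <!-- Wrapper for the article itself -->
  <div id="mainContent">
    <h2>Semantic Coding</h2>
    <p>Article text goes here.</p>
  </div>
</body>
</html>
```

The names describe what each wrapper holds rather than how it looks, which is about as much meaning as a <div> can be given.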

You relax into your chair feeling glad that you’ve brought order back into the world. However, if you are anything like me (for your sake I hope you’re not) there will always be one or two rogue elements. For instance, you’ve decided to go for a two-column layout. What on earth do you call their parent <div>s? You could call them col1 and col2 or leftCol and rightCol but that just seems wrong. What if one day, via the magic of CSS, you decide to switch their position? Chaos would ensue and you’d be forced to hang up your web standards spurs in shame. Of course you know that you’ll never actually change the order, but it still niggles.
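As an illustration of the problem, and only an illustration: the ids below are the leftCol and rightCol names from above, while the widths and placeholder content are assumptions. Swapping the two floats in the stylesheet is all it takes to make the names lie.

```html
<html>
<head>
  <title>Position-based names after a layout swap</title>
  <style type="text/css">
    /* Original layout, where the names matched the positions:
       #leftCol  { float: left;  width: 30%; }
       #rightCol { float: right; width: 65%; }  */

    /* Columns swapped in CSS alone: leftCol now sits on the right */
    #leftCol  { float: right; width: 30%; }
    #rightCol { float: left;  width: 65%; }
  </style>
</head>
<body>
  <div id="leftCol">Navigation and links</div>
  <div id="rightCol">Main article text</div>
</body>
</html>
```

Nothing in the markup changed, yet every id in it is now misleading.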

The question that springs to my mind is “does it matter?” Despite what we’d like to think, not everything has semantic meaning. Some things are purely presentational and sometimes it’s going to be impossible to completely separate meaning from display. While it’s reasonable to expect that current and future computer systems will be able to understand the meaning behind a <cite> element, is it reasonable to expect them to understand each individual author’s personal semantic vocabulary? Because let’s face it, there really is no standard way of semantically identifying page elements outside the regular (X)HTML tags. As such, pretty much all this “semantic” information is meaningless. It makes us happy (and to an extent makes development easier) but it’s not really intended for anybody other than ourselves.

Posted at May 7, 2004 5:48 PM

Comments on: Semantic Coding

(X)HTML tags should all have semantic meaning. Class/Div names don’t and won’t. With respect, I think you’re confusing semantic meaning with sensible class/div naming for easier code maintenance.

Classes and divs, as you say, are hooks. Hooks to attach presentation. Presentation only, no meaning.

Having said that, I wonder if em and i actually carry any weight with search engines etc., unlike hn which obviously do.

Also it could be argued that presentation does have some meaning. Others, such as Anne van Kesteren, have argued that divs are semantic, in the same way ‘section’ will be semantic.

Posted by: Rob Reed at May 7, 2004 6:23 PM

I think it really boils down to who you think will be looking at your code. You can get the browser to render your code the same way using CSS. So, ultimately, which is more maintainable? DIV soup or pure semantic? Sometimes going as semantic as possible can be more difficult and time consuming. It’s hard to implement ideals into reality, and sometimes there just isn’t enough time or resources.

But in my humble opinion, pure semantic markup is a goal to strive for with the realisation that it can rarely be attained in the current state of browser technology.

Posted by: Dave at May 7, 2004 6:50 PM

The thing that’s been bugging me is italicizing foreign language words/phrases. I think I’ve settled on <i lang="fr">. <i> has an evil stigma to it, but it’s purely presentational and that’s what you want. Speaking browsers will see the lang and adopt a French accent, while visual browsers will use italics. You could use <span class="foreign" lang="fr">, but, as you said, span has no semantic meaning. Google won’t have any clue that you’ve used a foreign word. Must we add every semantic phrase to the XHTML vocabulary?

Posted by: David Ely at May 7, 2004 6:58 PM

I think this is why some people want XML to become a web language and not just XML masquerading as HTML. Instead of making a div something it’s semantically not, you can say

this is my content
hello from the left column!

And voila, you get your meaning back.

Posted by: Jay at May 7, 2004 8:00 PM

Guess <code> didn’t catch those tags..

so again, you can say:

<maincolumn>this is the main column</maincolumn>

etc.

Posted by: jay at May 7, 2004 8:05 PM

Comrade

Standards/CSS-aware designers and developers develop sites that are more semantic and therefore ‘purer’ than those built on ‘car-crash’ mark-up.

But as we designers travel further on our semantic/standards journey, we are often in danger of becoming overly puritanical, beating our own backs with twigs until we bleed, just for using an ‘unnecessary’ div or span.

It’s the end results that count. Benefits for us, benefits for our clients and benefits for our clients’ customers.

Design IS compromise by definition, and accessible/standards design is for the widest number of people. Leave puritanism for the zealot minority.

Posted by: Malarkey at May 7, 2004 9:11 PM

Matthew Thomas has posted on this recently (which I disagreed with, although I had to concede), in an attempt to stop people blindly replacing i and b with em and strong. He also talks about non-human readers of semantic code (search engines etc.).

Posted by: Joseph Lindsay at May 7, 2004 9:12 PM

I’ve started my own article on semantic markup today, without knowing you already posted one. Tonico Strasser mailed the entry to the Plone UI list.

And I was cheeky enough to write an answer, claiming that your article makes some mistakes: Semantic coding: an answer for Andy Budd.

I hope you’ll find the time to answer as well, otherwise I’ll bug you via e-mail ;)

Posted by: Michael Zeltner at May 7, 2004 9:33 PM

Good points, Andy. I think you’re right in that the specifically named <div>s and <span>s can be meaningless, but as you pointed out earlier in the post, there are good reasons for using semantic markup for everything else: search engines, assistive software, and devices that display headings, lists, etc. as you had intended. It’s for those reasons that I believe the effort of being semantic, when possible, is worth it.

Although I will agree that being 100% semantic in 2004 is impossible ;-)

Posted by: Dan Cederholm at May 7, 2004 9:50 PM

we are often in danger of becoming overly puritanical, beating our own backs with twigs until we bleed, just for using an ‘unnecessary’ div or span.

This I agree with. 100% semantic code might be a goal worth attaining, but we shouldn’t be too harsh on ourselves, at least not until the time comes when we can achieve this without losing sleep.

Posted by: jim at May 7, 2004 10:08 PM

When reading this article, all I could think of is:
What is the meaning of life?

Posted by: Jon Hermansen at May 7, 2004 10:12 PM

It will all be better with XHTML 2, which brings in some new elements with semantic meaning, such as <section>, to take the place of DIVs. Class names are for humans only, though naming is still important in case someone else is tasked with editing your design. Good commenting and structure are also important. It’s just basic coding really. Nothing more, nothing less.

Posted by: Bruce at May 7, 2004 10:56 PM

You had a bad coding day, Andy? ;)

Posted by: Aleksandar at May 8, 2004 1:39 AM

Giving such names to DIV elements is just plain evil. I know it happens. However, that is not a reason to follow.

See also: http://webstandardsgroup.org/features/anne-van-kesteren.cfm#semantic-meaning and http://annevankesteren.nl/archives/2004/03/a-hole-in-html

Posted by: Anne at May 8, 2004 12:15 PM

This post inspired some thoughts.

Posted by: David House at May 8, 2004 5:34 PM

Google will look at your headlines and think “hmm, that’s a headline, it must be important”.

I’d say they’re doing it now, and have for a long time.

An interesting bit of info on that topic was posted in a forum I frequent; it’s linked via my blog, via the URL attached to my name below (so as to avoid long-ur-itis).

Posted by: Mike P. at May 8, 2004 10:40 PM

so calling it “leftColumn” and “rightColumn” doesn’t make sense… of course -

but how about -

“mainContent”
“blogPosts”
“/blogPosts”
“blogComments”
“/blogComments”
“/mainContent”

“secondaryContent”
“recentBlogPosts”
“/recentBlogPosts”
“externalLinks”
“/externalLinks”
“ads”
“/ads”
“/secondaryContent”

naming divs based on position on screen doesn’t make sense - naming them based on their content does. maybe i’ve just had too many beers?

Posted by: charles s. at May 9, 2004 7:30 AM

Because let’s face it, there really is no standard way of semantically identifying page elements outside the regular (X)HTML tags.

Surely there is actually a standard: XML namespaces. In theory if you wanted to you could include semantically-rich elements from, for example, the DocBook standard in your XHTML pages, rather than use divs. Although I realise that many current browsers would have trouble with this, the standard is out there.

Combine this with a pinch of XSLT and you get geeks like me coming over all faint with excitement.

Posted by: mjr at May 9, 2004 1:04 PM

It would be kind of interesting if someone came up with a sort-of “standard” for class and id names in typical settings. For example, one could develop a standard set of names for blogs, one for photo galleries, etc.

While the chances of widespread adoption for something like this are small, it would at least be very beneficial within an organization. For example, you boys at Message could all agree to name things the same way on each project, ensuring that when one dude has to look at another one’s code, you’ll know what the hell he/she was talking about. It’d almost be like a corporate style guide for XHTML code…

Posted by: Jeff Croft at May 9, 2004 2:55 PM

Sensible naming should be used whenever possible. If you have a typical two column layout, one column is usually the main content, while the other column is meta-content associated with the main content, or the site as a whole.

For example, on my site, I have the right column in a DIV with an ID of “news” (it contains the latest news on the site), while the main content is in a DIV called “summary”. The DIVs themselves are semantic in the sense that they do have meaning. They are a logical grouping of similar elements. Just because you can arbitrarily ID/CLASS your DIV (or span) doesn’t make it presentation only.

Today Google may not care about DIVs, but it could be made to. Just like it weighs text near the beginning of the document higher than that at the end (offset by a myriad of other things), it could consider elements that are grouped in a DIV to be of similar (relational) value.

Posted by: CM Harrington at May 9, 2004 3:57 PM

Currently the only two names (set by id or class) that would be of any real use as a standard are “header” and “footer”.

Given that we’re only talking about machines reading and parsing pages, it would be most helpful for a machine to discount the contents of those elements for the purpose of ‘understanding’ a page etc.

Posted by: pid at May 9, 2004 4:00 PM

You’ll wrap a <div> or a <span> around another element (or group of elements) to act as a hook for your style. By doing this you’re adding meaningless markup to your code for display reasons. This makes you feel bad, so you’ll try to give these hooks some meaning.

Actually… this whole issue could be resolved by simply not feeling bad about it. 8^)

If I want to wrap a <div> around a section of markup to create a hook to lay a presentation style on it, I’ll do so while feeling good about it.

I don’t expect anyone to be able to define a “semantic” markup language that can possibly cover every aspect of what I want to do as a designer. The only solution is to go pure XML and create my own tag definitions to do this on a per project basis. Given how much gets lost going that route on a global scale, I see no problem with adding divs and spans attached with classnames to solve the design and presentation problem.

Posted by: Andrei Herasimchuk at May 9, 2004 6:42 PM

Well, the most often heard reason why people should write semantic markup is that it gets you higher Google rankings…

There are http://www.seochallenge.com/ that are willing to allow people to come up with new subversive measures to use alongside semantic coding to get the highest ranking for nigritude ultramarine…read more on the slashdot story on nigritude ultramarine.

This kind of contest reduces the effectiveness of Google, thus making the point of writing semantic markup to get a higher ranking a bit weaker.

Posted by: AkaXakA at May 9, 2004 9:39 PM

I don’t think DIV lacks semantic meaning, as it marks a section of HTML. This section can be named Menu (<div id="menu">) or something else, but it’s still a section. OK, we can have <section> in XHTML 3.0, but that’s just the same thing.

Posted by: dusoft at May 10, 2004 12:05 AM

to me, the simple fact that you’re naming a div ‘mainContent’ and I do the same, AND we’ve not personally made an arrangement on this, is a great step towards the idea of a ‘standard’ way of naming elements. Not to speak of ‘mainMenu’, ‘header’ and ‘footer’…

It’s source code and as such is going to be read by someone else, maintaining it later, or someone at css-d helping us, or even by yourself a couple of months later.

It doesn’t have to be semantic web ‘semantics’… just some common-sense-forward-thinking ‘meaning’.

It’s just community de-facto semantics, it’s no big deal, but it’s Just Good.

Posted by: manuel at May 10, 2004 2:01 AM

Maybe it is that last 2% which is causing the trouble. The rest is quite simple to mark up semantically and, to reiterate, it is not just computers that will need that meaning but co-workers trying to figure out quickly which block of code is the header and which is the sidebar.

As ever, balance.

Posted by: Paul watson at May 10, 2004 4:05 PM

I made the comment somewhere else (I can never remember where I say what), but I get the feeling that web coding has really grown from other programming languages and as Jeff Croft mentioned it would certainly be more efficient for a design firm to take it one step further.

If HTML could be short-handed using a more OO approach, like add li:3("Blog","Archive","About") or something, then designers could focus more attention on CSS and layout.

This would really facilitate an expansion on the web, if computers could rely on certain structures.

Just a thought.

Heath

Posted by: Heath at May 10, 2004 11:13 PM

I’d like to see more standards in the names of elements. I call the page area #stage whereas other people call it different things. This would lead to better portability as, like Zen Garden (only better), sites could be restyled with ease.

And FYI, Google does look at H1 tags and say “this is a heading”; it’s SEO 101 :)

Posted by: mark at May 11, 2004 9:46 AM

Google is pushing toward semantics mainly because 40% of G queries are not sponsored (yet), what a waste! :-)

Posted by: Omiod at May 12, 2004 4:35 PM