Semantic Coding | May 7, 2004

The rise of web standards has also seen the rise of semantic coding. We’ve had it drummed into us a thousand times that <h1>’s are for headings and not for big text, that <blockquote>’s are for quoting text and not for indenting. We’re all so used to this now that we do it by default (don’t we?).

Semantic markup has been sold to us very well. The dream is that one day, other computer systems will see meaning in your code. Google will look at your headlines and think “hmm, that’s a headline, it must be important”. Speech browsers will see that something is wrapped in an <em> and will dutifully emphasise it. Using (X)HTML elements in the correct, semantic way is very laudable and long may it last.

If you’re a CSS author, you’ve probably used container elements to help control the visual display of your pages. You’ll wrap a <div> or a <span> around another element (or group of elements) to act as a hook for your style. By doing this you’re adding meaningless markup to your code for display reasons. This makes you feel bad, so you’ll try to give these hooks some meaning. You’ll put all your branding code inside a <div> called branding and your main content will go inside a <div> imaginatively called mainContent.
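
As a rough sketch of what such hooks might look like (only the names branding and mainContent come from the text above; the headings, paragraph text and style rules are invented for illustration):

```html
<!-- Hypothetical style rules attached to the two hooks -->
<style type="text/css">
  #branding    { background: #333; color: #fff; }
  #mainContent { margin: 0 auto; width: 40em; }
</style>

<!-- The wrapper divs exist only so the rules above have something to target -->
<div id="branding">
  <h1>Site name</h1>
</div>

<div id="mainContent">
  <h2>Article title</h2>
  <p>Article text goes here.</p>
</div>
```

The <h1>, <h2> and <p> carry the meaning; the two <div>s add nothing a machine could rely on beyond grouping.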

You relax into your chair feeling glad that you’ve brought order back into the world. However, if you are anything like me (for your sake I hope you’re not) there will always be one or two rogue elements. For instance, you’ve decided to go for a two column layout. What on earth do you call their parent <div>’s? You could call them col1 and col2 or leftCol and rightCol but that just seems wrong. What if one day, via the magic of CSS, you decide to switch their position? Chaos would ensue and you’d be forced to hang up your web standards spurs in shame. Of course you know that you’ll never actually change the order, but it still niggles.
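
To make the niggle concrete, here is a minimal sketch of a float-based two column layout. The ids mainContent and secondaryContent and the widths are assumptions chosen for illustration, not a recommendation:

```html
<style type="text/css">
  /* Names describe what each column holds, not where it sits */
  #mainContent      { float: left;  width: 70%; }
  #secondaryContent { float: right; width: 28%; }
  /* To switch the columns, exchange the two float values.
     The markup and the names stay exactly as they are. */
</style>

<div id="mainContent">
  <p>The article itself.</p>
</div>

<div id="secondaryContent">
  <p>Links, archives and so on.</p>
</div>
```

Had the same <div>s been called leftCol and rightCol, swapping the floats would leave leftCol sitting on the right.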

The question that springs to my mind is “does it matter?” Despite what we’d like to think, not everything has semantic meaning. Some things are purely presentational and sometimes it’s going to be impossible to completely separate meaning from display. While it’s reasonable to expect that current and future computer systems will be able to understand the meaning behind a <cite> element, is it reasonable to expect them to understand each individual author’s personal semantic vocabulary? Because let’s face it, there really is no standard way of semantically identifying page elements outside the regular (X)HTML tags. As such, pretty much all this “semantic” information is meaningless. It makes us happy (and to an extent makes development easier) but it’s not really intended for anybody other than ourselves.

Posted at May 7, 2004 5:48 PM

Comments

Rob Reed said on May 7, 2004 6:23 PM

(X)HTML tags should all have semantic meaning. Class/Div names don’t and won’t. With respect, I think you’re confusing semantic meaning with sensible class/div naming for easier code maintenance.

Classes and divs, as you say, are hooks. Hooks to attach presentation. Presentation only, no meaning.

Having said that, I wonder if em and i actually carry any weight with search engines etc., unlike hn which obviously do.

Also it could be argued that presentation does have some meaning. Others, such as Anne van Kesteren, have argued that divs are semantic, in the same way ‘section’ will be semantic.

Dave said on May 7, 2004 6:50 PM

I think it really boils down to who you think will be looking at your code. You can get the browser to render your code the same way using CSS. So, ultimately, which is more maintainable? DIV soup or pure semantic? Sometimes going as semantic as possible can be more difficult and time consuming. It’s hard to implement ideals into reality, and sometimes there just isn’t enough time or resources.

But in my humble opinion, pure semantic markup is a goal to strive for with the realisation that it can rarely be attained in the current state of browser technology.

David Ely said on May 7, 2004 6:58 PM

The thing that’s been bugging me is italicizing foreign language words/phrases. I think I’ve settled on <i lang="fr">. <i> has an evil stigma to it, but it’s purely presentational and that’s what you want. Speaking browsers will see the lang and adopt a French accent, while visual browsers will use italics. You could use <span class="foreign" lang="fr">, but, as you said, span has no semantic meaning. Google won’t have any clue that you’ve used a foreign word. Must we add every semantic phrase to the XHTML vocabulary?
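
A quick sketch of the two options side by side (the sentence and the class name foreign are just examples):

```html
<!-- Option 1: presentational element plus a language attribute -->
<p>It was her <i lang="fr">raison d'être</i>.</p>

<!-- Option 2: generic hook, styled to look the same -->
<style type="text/css">
  .foreign { font-style: italic; }
</style>
<p>It was her <span class="foreign" lang="fr">raison d'être</span>.</p>
```

Either way it is the lang attribute, not the element or the class name, that tells software the phrase is French. In XHTML served as XML you would presumably want xml:lang as well.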

Jay said on May 7, 2004 8:00 PM

I think this is why some people want XML to become a web language and not just XML masquerading as HTML. Instead of making a div something it’s semantically not, you can say

this is my content
hello from the left column!

And voila, you get your meaning back.

jay said on May 7, 2004 8:05 PM

Guess <code> didn’t catch those tags..

so again, you can say:

<maincolumn>this is the main column</maincolumn>

etc.

Malarkey said on May 7, 2004 9:11 PM

Comrade

Standards/CSS aware designers and developers develop sites that are more semantic and therefore ‘purer’ than those built on ‘car-crash’ mark-up.

But as we designers travel further on our semantic/standards journey, we are often in danger of becoming overly puritanical, beating our own backs with twigs until we bleed, just for using an ‘unnecessary’ div or span.

It’s the end results that count. Benefits for us, benefits for our clients and benefits for our clients’ customers.

Design IS compromise by definition, and accessible/standards design is for the widest number of people. Leave puritanism for the zealot minority.

Joseph Lindsay said on May 7, 2004 9:12 PM

Matthew Thomas has posted on this recently (I disagreed with it, although I had to concede), in an attempt to stop people blindly replacing i and b with em and strong. He also talks about non-human readers of semantic code (search engines etc.)

Michael Zeltner said on May 7, 2004 9:33 PM

I’ve started my own article on semantic markup today, without knowing you already posted one. Tonico Strasser mailed the entry to the Plone UI list.

And i was cheeky enough to write an answer, claiming that your article makes some mistakes: Semantic coding: an answer for Andy Budd.

I hope you’ll find the time to answer as well, otherwise i’ll bug you via e-mail ;)

Dan Cederholm said on May 7, 2004 9:50 PM

Good points, Andy. I think you’re right in that the specifically named <div>s and <span>s can be meaningless, but as you pointed out earlier in the post, there are good reasons for using semantic markup for everything else: search engines, assistive software, and devices that display headings, lists, etc. as you had intended. It’s for those reasons that I believe the effort of being semantic, when possible, is worth it.

Although I will agree that being 100% semantic in 2004 is impossible ;-)

jim said on May 7, 2004 10:08 PM

we are often in danger of becoming overly puritanical, beating our own backs with twigs until we bleed, just for using an ‘unnecessary’ div or span.

This I agree with. 100% semantic code might be a goal worth attaining, but we shouldn’t be too harsh on ourselves, at least not until the time comes when we can achieve this without losing sleep.

Jon Hermansen said on May 7, 2004 10:12 PM

When reading this article, all I could think of is:
What is the meaning of life?

Bruce said on May 7, 2004 10:56 PM

It will all be better with XHTML 2, which brings in some new elements with semantic meaning, such as <section>, to take the place of DIVs. Class names are for humans only, though naming is still important in case someone else is tasked with editing your design. Good commenting and structure are also important. It’s just basic coding really. Nothing more, nothing less.

Aleksandar said on May 8, 2004 1:39 AM

You had a bad coding day, Andy? ;)

Anne said on May 8, 2004 12:15 PM

Giving such names to DIV elements is just plain evil. I know it happens. However, that is not a reason to follow.

See also: http://webstandardsgroup.org/features/anne-van-kesteren.cfm#semantic-meaning and http://annevankesteren.nl/archives/2004/03/a-hole-in-html

David House said on May 8, 2004 5:34 PM

This post inspired some thoughts.

Mike P. said on May 8, 2004 10:40 PM

Google will look at your headlines and think “hmm, that’s a headline, it must be important”.

I’d say they’re doing it now, and have for a long time.

An interesting bit of info on that topic was posted in a forum I frequent; it’s linked via my blog, via the URL attached to my name below (so as to avoid long-ur-itis).

charles s. said on May 9, 2004 7:30 AM

so calling it “leftColumn” and “rightColumn” doesn’t make sense… of course -

but how about -

“mainContent”
“blogPosts”
“/blogPosts”
“blogComments”
“/blogComments”
“/mainContent”

“secondaryContent”
“recentBlogPosts”
“/recentBlogPosts”
“externalLinks”
“/externalLinks”
“ads”
“/ads”
“/secondaryContent”

naming divs based on position on screen doesn’t make sense - naming them based on their content does. maybe i’ve just had too many beers?

mjr said on May 9, 2004 1:04 PM

Because let’s face it, there really is no standard way of semantically identifying page elements outside the regular (X)HTML tags.

Surely there is actually a standard: XML namespaces. In theory if you wanted to you could include semantically-rich elements from, for example, the DocBook standard in your XHTML pages, rather than use divs. Although I realise that many current browsers would have trouble with this the standard is out there.

Combine this with a pinch of XSLT and you get geeks like me coming over all faint with excitement.

Jeff Croft said on May 9, 2004 2:55 PM

It would be kind of interesting if someone came up with a sort-of “standard” for class and id names in typical settings. For example, one could develop a standard set of names for blogs, one for photo galleries, etc.

While the chances of widespread adoption for something like this are small, it would at least be very beneficial within an organization. For example, you boys at Message could all agree to name things the same way on each project, ensuring that when one dude has to look at another one’s code, you’ll know what the hell he/she was talking about. It’d almost be like a corporate style guide for XHTML code…

CM Harrington said on May 9, 2004 3:57 PM

Sensible naming should be used whenever possible. If you have a typical two column layout, one column is usually the main content, while the other column is meta-content associated with the main content, or the site as a whole.

For example, on my site, I have the right column in a DIV with an ID of “news” (it contains the latest news on the site), while the main content is in a DIV called “summary”. The DIVs themselves are semantic in the sense that they do have meaning. They are a logical grouping of similar elements. Just because you can arbitrarily ID/CLASS your DIV (or span) doesn’t make it presentation only.

Today Google may not care about DIVs, but it could be made to. Just like it weighs text near the beginning of the document higher than that at the end (offset by a myriad of other things), it could consider elements that are grouped in a DIV to be of similar (relational) value.

pid said on May 9, 2004 4:00 PM

Currently the only two names, (set by id or class), that would be of any real use as a standard, are “header” and “footer”.

Given that we’re only talking about machines reading and parsing pages, it would be most helpful for a machine to discount the contents of those elements for the purpose of ‘understanding’ a page etc.

Andrei Herasimchuk said on May 9, 2004 6:42 PM

You’ll wrap a <div> or a <span> around another element (or group of element) to act as a hook for your style. By doing this you’re adding meaningless markup to your code for display reasons. This makes you feel bad, so you’ll try to give these hooks some meaning.

Actually… this whole issue could be resolved by simply not feeling bad about it. 8^)

If I want to wrap a <div> around a section of markup to create a hook to lay a presentation style on it, I’ll do so while feeling good about it.

I don’t expect anyone to be able to define a “semantic” markup language that can possibly cover every aspect of what I want to do as a designer. The only solution is to go pure XML and create my own tag definitions to do this on a per project basis. Given how much gets lost going that route on a global scale, I see no problem with adding divs and spans attached with classnames to solve the design and presentation problem.

AkaXakA said on May 9, 2004 9:39 PM

Well, the most often heard reason as to why people should write semantic markup is that it gets you higher Google rankings…

There are contests such as http://www.seochallenge.com/ that are willing to allow people to come up with new subversive measures to use alongside semantic coding to get the highest ranking for nigritude ultramarine… Read more in the Slashdot story on nigritude ultramarine.

This kind of contest removes the effectiveness of google, thus making the point of writing semantic markup to get a higher ranking a bit weaker.

dusoft said on May 10, 2004 12:05 AM

I don’t think DIV lacks semantic meaning, as it is marking a section of HTML. This section can be named Menu (<div id="menu">) or something else, but it’s still a section. Ok, we can have <section> in XHTML 3.0, but that’s just the same thing.

manuel said on May 10, 2004 2:01 AM

to me, the simple fact that you’re naming a div ‘mainContent’ and I do the same, AND we’ve not personally made an arrangement on this, is a great step towards the idea of a ‘standard’ way of naming elements. Not to speak of ‘mainMenu’, ‘header’ and ‘footer’…

It’s source code and as such is going to be read by someone else, maintaining it later, or someone at css-d helping us, or even by yourself a couple of months later.

It doesn’t have to be semantic web ‘semantics’… just some common-sense-forward-thinking ‘meaning’.

It’s just community de-facto semantics, it’s no big deal, but it’s Just Good.

Paul watson said on May 10, 2004 4:05 PM

Maybe it is that last 2% which is causing the trouble. The rest is quite simple to mark up semantically and, to reiterate, it is not just computers that will need that meaning but co-workers trying to figure out quickly which block of code is the header and which is the sidebar.

As ever, balance.

Heath said on May 10, 2004 11:13 PM

I made the comment somewhere else (I can never remember where I say what), but I get the feeling that web coding has really grown from other programming languages and as Jeff Croft mentioned it would certainly be more efficient for a design firm to take it one step further.

If html could be short-handed using a more OO approach, like add li:3(“Blog”,”Archive”,”About”) or something. Then designers could focus more attention on css and layout.

This would really facilitate an expansion on the web, if computers could rely on certain structures.

Just a thought.

Heath

mark said on May 11, 2004 9:46 AM

i’d like to see more standards in the names of elements. I call the page area #stage whereas other people call it different things. This would lead to better portability: as with Zen Garden (only better), sites could be re-styled with ease.

and FYI google does look at H1 tags and say “this is a heading”. it’s SEO 101 :)

Omiod said on May 12, 2004 4:35 PM

Google is pushing toward semantics mainly because 40% of G queries are not sponsored (yet), what a waste! :-)