Liveblogging the Yahoo Developer Conference - Day 1 | December 5, 2007

Yahoo! are having their second internal developer conference this week, and they have very kindly invited a few external people to attend. As well as their London front end team, myself, Simon Willison, Natalie Downe, Glenn Jones, Matthew Somerville and a few other folks are in attendance.

First up is Simon Willison talking about Comet. Comet is essentially an old idea that has gained traction recently because somebody gave it a name. Unlike Ajax, Comet was deliberately named after a cleaning product.

Comet applications keep a connection open so that events can be pushed to the browser as they happen. This is the reverse of Ajax clients polling the server every 5 seconds for updates. Sites like Google Docs and Meebo currently use Comet. The concept first appeared in Netscape 1.1 as “server push”. The functionality still remains in modern browsers, but there is no notification when the connection is severed. Because of this, most people roll their own.

The current Comet methods are very hacky and usually use 4 or 5 different techniques to cover all the various browser quirks. One method is to use XHR to open up a connection, and then watch for the ready state to change to indicate new data. This works in good browsers but doesn’t work properly in IE.

The other main method involves opening up an iframe. However, every time an update comes through, browsers will click and the loading bar will throb. So a lot of Comet hacks involve hiding the throbbing and clicking sounds. Sane developers don’t want to deal with these issues, so it’s good that a lot of the crazy hacking has been done for you.

All these techniques work on the same domain. However, you probably want to have cross-domain Comet. For instance, browsers are only supposed to have two connections open to a domain at the same time, and a held-open Comet connection uses up one of them.

The most popular method is long polling. You open the connection, wait till something happens and receive the event. Once you’ve got the event the connection is automatically closed so you re-open the connection for the next event.
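As a rough sketch of that cycle (mine, not Simon’s), here is a toy in-process simulation in Python. There is no real HTTP involved: `FakeCometServer`, `publish`, `long_poll` and `run_client` are all made-up names, and a `queue.Queue` stands in for the held-open connection.

```python
import queue
import threading
import time


class FakeCometServer:
    """Toy stand-in for a Comet server (hypothetical, no real HTTP)."""

    def __init__(self):
        self._events = queue.Queue()

    def publish(self, event):
        # Something happens on the server side.
        self._events.put(event)

    def long_poll(self, timeout=1.0):
        # One "request": block until an event arrives or the timeout
        # expires, then return, which ends this connection.
        try:
            return self._events.get(timeout=timeout)
        except queue.Empty:
            return None


def run_client(server, wanted):
    """Re-open the connection after every response until `wanted` events arrive."""
    received = []
    requests_made = 0
    while len(received) < wanted:
        requests_made += 1
        event = server.long_poll()
        if event is not None:
            received.append(event)
    return received, requests_made


def demo():
    server = FakeCometServer()

    def producer():
        for n in range(3):
            time.sleep(0.05)
            server.publish(f"event-{n}")

    thread = threading.Thread(target=producer)
    thread.start()
    received, requests_made = run_client(server, wanted=3)
    thread.join()
    return received, requests_made
```

Each event costs one request, and the client spends most of its time with a request parked on the server, which is why the server side has to cope with so many idle connections.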

Client side Comet really sucks. However the big problem is scaling the server as it requires thousands of simultaneous connections. Apache isn’t set up to do this, so it doesn’t scale. What you need is event based IO. Instead of a thread or process per connection, you have one process that loops through hundreds of connections at a time. To do this, you probably need a separate Comet server.
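To illustrate the event-based idea, here is a minimal sketch using Python’s `asyncio`, which is my choice for illustration and not something mentioned in the talk. The function names are hypothetical. A single process and a single thread hold thousands of waiting “connections”, each one a cheap coroutine.

```python
import asyncio


async def held_connection(conn_id, new_event, delivered):
    # Each "connection" is a coroutine parked on the event,
    # not a thread or process of its own.
    await new_event.wait()
    delivered.append(conn_id)


async def serve(connection_count):
    new_event = asyncio.Event()
    delivered = []
    tasks = [
        asyncio.create_task(held_connection(i, new_event, delivered))
        for i in range(connection_count)
    ]
    await asyncio.sleep(0)  # give every connection a chance to start waiting
    new_event.set()         # one server-side event wakes all of them
    await asyncio.gather(*tasks)
    return len(delivered)


def demo(connection_count=5000):
    return asyncio.run(serve(connection_count))
```

Calling `demo()` holds 5000 waiting connections in one process, something a thread-per-connection setup would struggle with.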

Bayeux is a protocol for Comet. Any Bayeux client can talk to a Bayeux server. Data is encoded using JSON. Essentially these servers are black boxes, so they are interchangeable. Servers include Meteor and Orbited. Jetty is probably the easiest to set up.
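To give a flavour of the JSON involved, here is a sketch of three Bayeux-style messages built in Python. The field names are as I understand them from the Bayeux 1.0 spec; the helper functions, the client id and the `/chat/demo` channel are invented for illustration, so check the spec before relying on any of it.

```python
import json


def handshake_message():
    # First message a client sends to open a session.
    return {
        "channel": "/meta/handshake",
        "version": "1.0",
        "supportedConnectionTypes": ["long-polling"],
    }


def subscribe_message(client_id, subscription):
    # Ask the server to deliver events published to `subscription`.
    return {
        "channel": "/meta/subscribe",
        "clientId": client_id,
        "subscription": subscription,
    }


def publish_message(client_id, channel, data):
    # Publish arbitrary JSON data to a channel.
    return {"channel": channel, "clientId": client_id, "data": data}


def encode(messages):
    # Messages travel as a JSON array of message objects.
    return json.dumps(messages)


def demo():
    client_id = "hypothetical-client-id"  # normally issued by the server
    return encode([
        subscribe_message(client_id, "/chat/demo"),
        publish_message(client_id, "/chat/demo", {"text": "hello"}),
    ])
```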

Despite all the crazy stuff you need, Comet apps are actually really easy to build using the Dojo library. Simon spends the next 5 minutes showing us how to build a simple Comet app.

Next up is Norm talking about coding standards. This is a rerun of the talk he did at BarCamp, so I’m sat at the back of the room, plugged into the Internets.

After a catered lunch, a few of us went to stretch our legs and grab a cuppa. On the way we dropped by the Neal’s Yard cheese shop so Simon and Nat could put their Christmas orders in. Think I’ll pop back at the end of the day to pick up some supplies. Over lunch we had a lively debate about a range of geeky topics. These ranged from how much better Django is than Rails, to the benefits of CouchDB, to some obscure programming language used in the telecoms industry.

After lunch I saw a talk on event handling and the YUI. Sadly the person giving the talk was very soft spoken, so I only caught every third word. Must move closer for the next session.

The next session is Nicole Sullivan talking about high performance web sites. I’ve been really impressed with the performance stuff Yahoo have been putting out there recently, so this should be good. In fact, Nicole works for a six person Yahoo team called “Exceptional Performance”.

Nicole starts by talking about their team make-up and the fact that they want Yahoo to be seen as a centre of good performance. In fact, one of their team has just published a book on the subject.

About 95% of user response time comes from the front end, so you need to start there. This is where the biggest gains can be made, and they are usually much simpler to implement.

Nicole’s team do a lot of experiments. One of the experiments revolved around caching. They looked at the percentage of people who came in with an empty cache versus a full cache, allowing them to test download speeds. Apparently 40-60% of Yahoo users come in with an empty cache.

The next experiment was to look at parallel downloads. The results showed that having two domain aliases helped speed up response times, but any more than four would slow things down.
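One way to spread assets over two domain aliases, sketched in Python. This is my own illustration, not Yahoo’s method, and the host names are placeholders. The important detail is that the same asset always maps to the same host, otherwise you defeat the browser cache.

```python
import zlib


def pick_host(path, hosts):
    # crc32 is stable across runs, unlike Python's built-in hash() for
    # strings, so an asset always lands on the same alias.
    index = zlib.crc32(path.encode("utf-8")) % len(hosts)
    return hosts[index]


def asset_url(path, hosts=("img1.example.com", "img2.example.com")):
    # Hypothetical aliases pointing at the same server.
    return f"http://{pick_host(path, hosts)}{path}"
```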

Yahoo have 14 rules for high performance websites. Not going to cover them all. This looks very similar to the info Nate has presented in the past.

Rule 1: Make fewer HTTP requests by using CSS sprites and combining scripts and stylesheets. Most big sites don’t do this. Yahoo do it using something called the combo handler, which is unfortunately Yahoo only at the moment.
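Combining scripts needs nothing fancy. A minimal build-step sketch in Python follows; it has nothing to do with how the combo handler actually works, which wasn’t described, and `combine_scripts` is a made-up name.

```python
def combine_scripts(sources):
    # Join with ";\n" so a file that is missing its final semicolon
    # cannot run into the start of the next one.
    cleaned = [source.strip().rstrip(";") for source in sources]
    return ";\n".join(cleaned) + ";"
```

Three script files served this way cost one HTTP request instead of three.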

Some discussion about the ideal size of sprites and the maximum pixel size before Opera suffers a buffer overflow. One suggestion is to use rounded corner boxes with a transparent centre, allowing you to use CSS borders. Sprites aren’t always a good idea on page heavy sites as you pay the price on maintenance. Some discussion on whether you should optimise on a page/module basis or a site basis. Apparently Yahoo Europe have developed a public tool for CSS Sprites.
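For anyone who hasn’t built a sprite before, the CSS side is just a running offset. Here is a hedged Python sketch that generates rules for images stacked vertically in one file. The function name, class names and `sprite.png` are invented, and it makes no attempt to handle very large offsets.

```python
def sprite_rules(images, sprite_url="sprite.png"):
    """images: list of (class_name, width, height) stacked top to bottom."""
    rules = []
    y = 0
    for name, width, height in images:
        # Shift the sprite up by the combined height of the images above.
        offset = f"-{y}px" if y else "0"
        rules.append(
            f".{name} {{ background: url({sprite_url}) 0 {offset} no-repeat; "
            f"width: {width}px; height: {height}px; }}"
        )
        y += height
    return rules
```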

Rule 3: Add an expires header on images, scripts etc.
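A sketch of what a far-future expires header might look like when generated in Python. The one-year lifetime and the `far_future_headers` name are my assumptions, not figures from the talk.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def far_future_headers(now, days=365):
    # `now` must be a timezone-aware UTC datetime so that the date can
    # be rendered in the HTTP format ending in "GMT".
    expires = now + timedelta(days=days)
    return {
        "Expires": format_datetime(expires, usegmt=True),
        "Cache-Control": f"max-age={days * 24 * 60 * 60}",
    }


def demo():
    return far_future_headers(datetime(2007, 12, 5, 10, 59, tzinfo=timezone.utc))
```

The catch is that a cached file won’t be re-fetched until it expires, so you need to change the filename whenever the content changes.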

Rule 4: Gzip HTML, scripts, stylesheets, XML, JSON etc.
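To see why this is worth doing, here is a quick Python sketch that compares sizes before and after gzip. Markup is repetitive, so it compresses well. The `gzip_saving` helper is hypothetical; on a real site the web server would normally do this for you and add a `Content-Encoding: gzip` header.

```python
import gzip


def gzip_saving(text):
    # Returns (original size, compressed size) in bytes.
    raw = text.encode("utf-8")
    packed = gzip.compress(raw)
    return len(raw), len(packed)


def demo():
    markup = "<li class='item'><a href='#'>hello</a></li>\n" * 200
    return gzip_saving(markup)
```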

Rule 5: Put stylesheets at the top, as per spec. CSS at bottom is actually faster, but nothing renders. Use link, not @import, as this also appears faster.

Rule 6: Put scripts that aren’t crucial to the loading of the page at the bottom of the page to prevent them from blocking the load.

Rule 7: Avoid CSS expressions as they seem to slow the page. I stupidly thought she meant filters. Doh!

Rule 10: Minify JavaScript and CSS, but don’t obfuscate.

Rule 14: Make Ajax cacheable.

Discussed Akamai as a content delivery network. Norm mentioned Amazon.

Now a talk on PHP security. Potentially interesting, but not my domain, so won’t be taking notes.

Posted at December 5, 2007 10:59 AM


Ed Eliot said on December 5, 2007 11:25 PM

Stuart and I developed a CSS Sprite Generation tool although it was more of a personal project than an official Yahoo! tool -

Re the Opera “overflow” bug: the maximum offset (positive or negative) is 2042px. Anything over this is treated as 2042px, so if you have a large sprite image you need to create multiple columns. The above mentioned tool handles this automatically.

Ross Bruniges said on December 6, 2007 8:58 AM

Good to read this stuff Andy; wish I was there myself!!!

Looking forward to reading your stuff on day2 (and 3)??

wayne said on December 7, 2007 6:33 PM

It was Alex Russell who coined it first in 2005 - I think it was in a blog post discussing gTalk when it first came out. I remember me and my step dad stayed up late one Friday night with our laptops, a load of whiskey and some C++ code forcing packet delays and sniffing traffic to see how it all worked!! Very interesting take on content delivery and application.

Should have waited a bit though as it was shortly after that I read Alex’ blog about Comet!! :s

Stubbornella said on December 16, 2007 6:40 PM

You should always sprite! The question you have to ask yourself is, can you afford to sprite your sprites making a mega-sprite that combines all the images in your site?

The answer is mainly a question of page numbers. If you have 100s or 1000s of pages you need to sprite by module and avoid mega-sprites that will make maintenance much more complicated.

If you have an enormous amount of time to spend on maintenance or very few pages a mega sprite is the right way to go.

It was nice to meet you Andy.