This weekend the Netherlands (and many other countries too, but for the purpose of example I'll stick to NL) experienced something that happens only every few years. We had up to 25 centimeters of snow, which is unusual for us, so it disrupted life significantly. On Sunday, buses and trains were canceled, destinations became unreachable, and at Schiphol airport alone, over 700 people had to spend the night because it was impossible to get anywhere.
Snow continued to fall on Sunday evening, so this morning most news outlets expected chaos in traffic. The Ibuildings Netherlands offices had already advised all staff to work from home on Monday, but not every company is able to tackle it that way, so the number one question for many people this morning was: "Will I be able to get to work?" The answer to that question should be simple: either trains and buses run normally, they have delays, or they don't run at all. Still, many of the Dutch public transport companies struggled to get that information across. Read on for an overview and an analysis, using the public transport companies of the major Dutch cities as an example of how important it is to do 'peak management'.
This morning I had a look at the major public transport websites to see how they dealt with the increase in traffic caused by the weather. I looked at the sites of the transport companies for major cities in the Netherlands:
- GVU in Utrecht,
- HTM in The Hague,
- RET in Rotterdam,
- Veolia in Maastricht,
- GVB for Amsterdam,
- Arriva for several other cities, and
- NS.nl and 9292ov, two national public transport websites.
The Good
Both national websites, NS.nl and OV9292.nl, worked like a charm.
NS
NS.nl is the site of the national Dutch railways, and it breathed 'business as usual'. All functionality seemed operational, and the regular news section had a short but clear message: "There will be trains today, but fewer than usual, so expect massive delays. You are advised not to use the trains today." Although a bit more detail would have been nice, for a website that needs to cover the entire country this was at least clear and concise, and it worked. The fact that the information was reachable within one click of the homepage probably helped prevent a cascading effect, in which visitors who cannot find what they need keep clicking around and reloading, adding even more load.
OV9292
OV9292.nl also handled the traffic well. They tackled this morning's peak using a simple concept: they rearranged the website so that the information everybody would be looking for was on the homepage, and the heavier features of the site were now one click away (this is part of a strategy we call 'graceful degradation'). Here's a screenshot of what this looked like:
The 99% of people looking for information on delayed public transport were instantly satisfied, and those needing the site's regular homepage functionality could still use the menu to go there. Kudos to OV9292.nl for handling the peak this way.
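To make the idea concrete, here is a minimal Python sketch of such a 'peak mode' switch. I don't know how OV9292 actually implemented it; the section names, the `heavy` flag and the `peak_mode` parameter are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Section:
    name: str
    heavy: bool  # True if rendering needs expensive backend work


# Hypothetical homepage sections, in their normal display order.
SECTIONS = [
    Section("journey_planner", heavy=True),
    Section("service_alerts", heavy=False),
    Section("news", heavy=False),
    Section("live_departures", heavy=True),
]


def homepage_layout(peak_mode: bool) -> dict:
    """Decide which sections are rendered and which become plain links."""
    if not peak_mode:
        # Normal operation: render everything on the homepage.
        return {"render": [s.name for s in SECTIONS], "link_only": []}

    # Peak mode: render only the cheap sections, with the alerts on top.
    cheap = [s.name for s in SECTIONS if not s.heavy]
    cheap.sort(key=lambda name: name != "service_alerts")  # stable sort

    # Heavy features stay available, but one click away.
    heavy = [s.name for s in SECTIONS if s.heavy]
    return {"render": cheap, "link_only": heavy}
```

With `peak_mode=True` the homepage shows the alerts and the news, and the journey planner and live departures are reduced to menu links. The point is that the switch is a cheap layout decision rather than extra hardware.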
GVB (Amsterdam) and Veolia (Maastricht)
Aside from these two national sites, the site of transport company GVB worked fine as well, and its homepage carried a clear "most of the buses in Amsterdam are ok" message. Veolia's was the other website running business as usual.
The Bad
RET (Rotterdam)
The RET (Rotterdam) website was up, but I still listed them in the 'bad' category. Why? I say: if you do something, do it right. RET recently launched a Twitter account, which is featured prominently on their homepage. Twitter is an ideal way to communicate, but there has not been a single tweet on the account since Thursday.
Since at least RET's website was up, I won't complain about their Twitter activity too much. Let's look at some other websites.
GVU (Utrecht)
Utrecht is one of the major public transport hubs in the Netherlands, given its central place in the country. However, GVU's website displayed:
This roughly translates to 'due to a technical malfunction this part of the website is unreachable. We are working on it, sorry for the inconvenience'.
GVU did spread some information:
This message ('No buses until further notice!') had been posted at the central bus station since Sunday, and luckily someone tweeted it, which gave a lot of people the information they were looking for, but not with the help of GVU. A simple message on their homepage would have sufficed and would have been so much more relevant than 'the site is unreachable, sorry'. And it's as easy as uploading a single HTML file. An official tweet from GVU would have helped as well: many people on Twitter complained not only that they could not reach the GVU website, but also that the call center was overloaded.
HTM (The Hague)
The city that houses our national government had similar issues. HTM's website displayed:
The site was down, but HTM still scores points for having a relevant message, which includes an English translation and an alternative phone number.
The Ugly
I reserved the 'ugly' category for those sites that went down and didn't handle it gracefully. Luckily, this only happened to one site.
Arriva (various cities across The Netherlands)
Arriva's site couldn't handle the load. There was no alternative web page in place, and what's worse, the website wasn't configured to handle the peak load gracefully. A properly configured web server starts turning visitors away with a friendly error message (as we saw in the 'bad' category above); a badly configured one shows technical error messages or simply times out. Arriva's site did both; I first got this:
This means their web server could still handle things, but their database server couldn't. Their website in turn wasn't configured to deal gracefully with the database collapsing. Two minutes later, the web server went down as well:
I don't know what web server or technology Arriva is using, but the first thing to do for Arriva's IT team is to look into ways to make the failure more graceful. The HTM website displayed earlier in this article is a good example. It would of course be even better to implement some kind of scaling strategy.
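As an illustration of what 'failing gracefully' can mean when the database gives up, here is a small Python sketch. It is not based on Arriva's actual setup: `DatabaseUnavailable`, `render_disruptions` and the `fetch_disruptions` callable are hypothetical names I made up, and the two-minute `Retry-After` value is an arbitrary choice.

```python
import html
from http import HTTPStatus


class DatabaseUnavailable(Exception):
    """Hypothetical error raised when the database cannot be reached."""


# Static fallback text, kept in code so it needs no database or templates.
FALLBACK_BODY = (
    "<h1>Travel information is temporarily unavailable</h1>"
    "<p>Our site is very busy. Please try again in a few minutes.</p>"
)


def render_disruptions(fetch_disruptions):
    """Return (status, headers, body) for the disruptions page.

    fetch_disruptions is any callable returning a list of strings;
    it stands in for the real database query.
    """
    headers = {"Content-Type": "text/html; charset=utf-8"}
    try:
        items = fetch_disruptions()
    except DatabaseUnavailable:
        # Database is down: answer with a readable page and a 503,
        # instead of leaking a technical error or timing out.
        headers["Retry-After"] = "120"
        return HTTPStatus.SERVICE_UNAVAILABLE, headers, FALLBACK_BODY

    rows = "".join(f"<li>{html.escape(item)}</li>" for item in items)
    return HTTPStatus.OK, headers, f"<ul>{rows}</ul>"
```

The visitor still doesn't get the timetable, but gets a human-readable message, and the 503 status tells browsers and crawlers that the problem is temporary.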
What can companies do to keep the information flowing?
There are a number of things that can be done to cope with peak situations like this:
- Implement a scalability strategy. Have a look at how peaks should be handled, and what can be done to publish the information if the site is under heavy load.
- Use Twitter. More and more people are using Twitter to convey or read real time information. I recommend service departments start considering the use of Twitter.
- Put a temporary HTML file in place. This is the simplest, most effective and most doable countermeasure. If everything else fails, put up a single HTML file showing the most important information.
- Scale your infrastructure. Consider public or private cloud services if your company has to deal with occasional peaks. Clouds allow you to scale the infrastructure when needed.
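The temporary HTML file from the list above needs very little machinery. As a rough sketch, assuming nothing more than a machine with Python on it, the following serves one fixed page for every URL; the function name `make_static_server`, the default port and the one-minute cache lifetime are my own choices for illustration. In practice you would more likely point your existing web server at a static file, but the principle is the same.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_static_server(page: bytes, host: str = "127.0.0.1", port: int = 8080):
    """Build a server that answers every GET request with the same page."""

    class OnePageHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Same response for any path: no database, no templates.
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(page)))
            # Let browsers and proxies reuse the page for a minute.
            self.send_header("Cache-Control", "max-age=60")
            self.end_headers()
            self.wfile.write(page)

        def log_message(self, format, *args):
            pass  # skip per-request logging while under load

    return ThreadingHTTPServer((host, port), OnePageHandler)
```

Calling `make_static_server(b"<h1>No buses until further notice</h1>").serve_forever()` would then answer every request with that one message until the regular site is back.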
Why companies should care
It is startling to see that the majority of the major transport websites had problems. Companies dealing with this problem might think "Why should we care? Events like this don't happen that often, so it's acceptable that our site is down." I disagree. Especially if you are a company that needs to get information across during those moments of chaos, you should make sure that you are able to. Otherwise, you may even be contributing to the overall problem: travelers will not know what to do, flock to stations anyway and cause further chaos.
These are the moments when the information on your site is most relevant, and it's not even that complex to tackle. The argument "we cannot invest in infrastructure to handle those peaks; it would be idle 99% of the time" is weak. If this were about performance, that argument might hold, but it's about scalability, and scalability can be a core feature of your system that does not require an overdose of infrastructure.
Concepts such as static file caching, 'graceful degradation', or simply rearranging the important information can be implemented effectively in most applications. If you are dealing with peak problems similar to what's described in this post, keep an eye on our Techportal site, which has relevant articles about dealing with peaks, or tell your company's IT manager to contact us if we can help implement a solid scalability strategy.