Five ways to improve Web site uptime

You can't fix the Internet, but you can tap these tools to decrease downtime for your organization's Web site

CDNs are designed to quickly route traffic onto private networks, easing or eliminating the burden on the public Web site itself, Forrester's Staten explains.

Without a CDN, massive media files would cripple a site like Netflix.com or Cinemanow.com almost immediately.

A CDN provider handles congestion by adding "last mile" communication centers in cities that, according to the provider's own data models, absorb most of the traffic. For example, in areas of California, video-over-the-Web is more common than elsewhere in the country, so Akamai or Limelight might install a center in San Francisco to handle the load.

Wider use of content delivery networks can definitely help, because these services keep traffic at the edges of the Internet rather than routing it all the way through, says Staten.

According to Staten, one issue with a CDN is that not all content can be cached. Crosland explains that "CDNs are great for static content" such as videos and music but can't be used for dynamic or database-driven information such as search results and Twitter updates. "That's where intelligent caching comes into play," says Crosland.
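The static-versus-dynamic split Crosland describes usually shows up in the caching headers a site sends. The sketch below is only an illustration: the function name, the list of file extensions and the one-day lifetime are assumptions, not anything the sources quoted here prescribe. It marks media and asset files as cacheable by a CDN and tells caches to revalidate everything else, such as search results.

```python
import os

# Hypothetical list of file types treated as static, cacheable content.
STATIC_EXTENSIONS = {".mp4", ".mp3", ".jpg", ".png", ".css", ".js"}


def cache_control_for(path: str) -> str:
    """Return a Cache-Control header value for a request path."""
    # Drop any query string before looking at the file extension.
    path_only = path.split("?", 1)[0]
    extension = os.path.splitext(path_only)[1].lower()
    if extension in STATIC_EXTENSIONS:
        # Static files: any cache, including a CDN node, may keep a copy
        # for one day (86400 seconds, an assumed lifetime).
        return "public, max-age=86400"
    # Dynamic pages: caches must check with the origin before reusing a copy.
    return "no-cache"


print(cache_control_for("/videos/demo.mp4"))
print(cache_control_for("/search?q=cdn"))
```

A real site would tune the lifetimes per content type; the point is only that the origin server, not the CDN, decides what is safe to keep at the edge.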

Jason Mayes, a senior engineer at XMOS Ltd., says that working with Cambridge, Mass.-based Akamai Technologies Inc. played a major role in alleviating pressure when XMOS started offering online videos. XMOS would post a 200MB video, and thousands of users would attempt to access it at the same time, stressing XMOS's 10Mbit/sec. connection. "Videos made our site slower for regular users. This was a major concern, as a Web site reflects a company's ethos," he says.
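A rough calculation shows why that connection struggled. The sketch below assumes decimal units (1MB equals 8Mbit), ignores protocol overhead, and uses a hypothetical figure of 1,000 simultaneous viewers, since Mayes says only "thousands."

```python
def transfer_seconds(file_megabytes: float,
                     link_megabits_per_sec: float,
                     simultaneous_users: int = 1) -> float:
    """Time to push one file to every user over a single shared link."""
    # 1 megabyte = 8 megabits (decimal units, no protocol overhead assumed).
    total_megabits = file_megabytes * 8 * simultaneous_users
    return total_megabits / link_megabits_per_sec


one_viewer = transfer_seconds(200, 10)            # 160.0 seconds
many_viewers = transfer_seconds(200, 10, 1000)    # 160000.0 seconds
print(one_viewer, many_viewers / 3600)            # second value is in hours
```

Under those assumptions, a single download ties up the whole link for nearly three minutes, and 1,000 simultaneous downloads would take about 44 hours to finish, which is why offloading the video to a CDN made such a difference.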

After implementing Akamai's CDN, Mayes says, all page-delivery times, including those for text and video, went from 17 seconds to 5 seconds in Asia, where many of XMOS's site visitors originate. He says he might also look at redesigning his company's content management system (CMS) to make it faster now that the main bottlenecks have been fixed.

3. Use more and better caching

Another common tactic for dealing with Internet problems is to cache content. This technique is becoming more common, according to Pieter Poll, chief technology officer at Qwest Communications International Inc. in Denver, because it enables a site to scale up more easily when users flock to popular content, such as a new episode of CSI: Miami on Hulu.com.

Caching on the Internet works just like memory caching in your computer -- holding the most popular content in a reserved block of storage on the server for fast access. (A CDN is also a type of cache, in that popular content is delivered from a separate node.) Staten says tier-caching products such as GigaSpaces, Oracle Coherence and Memcached help cache content within the Web site infrastructure by making sure that content in a database is accessible at all times, even during the worst traffic spikes.
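In code, the idea looks roughly like the sketch below. It is not how any of the products Staten names work internally; it is a minimal in-memory stand-in, and the class name, the 30-second lifetime and the slow_database_lookup function are all hypothetical. Each item is kept for a fixed time, and the database is consulted only when the cached copy is missing or stale.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps key -> (expiry time, cached value).
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # fresh copy: skip the database
        value = loader(key)            # missing or stale: go to the source
        self._entries[key] = (now + self._ttl, value)
        return value


def slow_database_lookup(key: str) -> str:
    # Hypothetical stand-in for an expensive database query.
    time.sleep(0.01)
    return "article body for " + key


cache = TTLCache(ttl_seconds=30)
first = cache.get_or_load("top-story", slow_database_lookup)   # hits the database
second = cache.get_or_load("top-story", slow_database_lookup)  # served from cache
```

During a traffic spike, thousands of requests for the same popular item within the 30-second window would trigger only one database query. Choosing the lifetime is the trade-off: longer means less load but staler content.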

In fact, caching is one of the main ways that sites like GDGT, Twitter, Facebook and others deal with surges -- the technique is perfect for handling small chunks of data that change often, such as popular articles, forum posts or news items.

Yet caching is still not as widespread as it could be. GDGT uses it to solve ongoing congestion -- although it wasn't enough to keep the site running at launch. And many sites that are just gaining traction, such as the video delivery site Crackle, are still tweaking their cache settings.

Poll and other industry observers note that Web usage increases at a rate of about 42% each year, but broadband capacity has not grown at that rate, so caching and CDNs are increasingly important.

4. Use better programming methods

One emerging method of dealing with traffic problems is to program using techniques that can withstand spikes. Brian Sutherland, a managing partner at Vanguardistas LLC, a company that provides a scaling architecture for sites such as that of U.S. News and World Report, says that a vast majority of Web sites are poorly programmed and aren't able to withstand unanticipated traffic spikes.

"There's a large engineering effort required to make sure that a Web site is capable of withstanding a large and sudden load," says Sutherland. A few examples of things that Web site software developers rarely do, according to Sutherland, include regularly benchmarking a representative copy of their servers against a simulated load; having an experienced developer review and approve every software change; and designing for the target performance level right from the start. "When you really want your Web site to stay up, you have to do these things. Twitter grew faster than anticipated and brought in a company after the fact to improve its uptime, which has worked."

According to Sutherland, these techniques -- which mirror methods used in enterprise computing -- might have to wait until Web expansion slows down a bit because developers tend to put speed ahead of reliability. He points out that banking Web sites are good examples of development initiatives that emphasized reliability from the outset, likely because they were subject to heavy federal regulations and faced customer demands for reliable financial transactions.
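The first practice Sutherland lists, benchmarking against a simulated load, can be sketched in a few lines. This is an illustration rather than his firm's method: the run_load_test function and its parameters are hypothetical, and the request is passed in as a plain function so the sketch runs without a network. It fires requests from several threads at once and reports failures and response times.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Tuple


def run_load_test(request_fn: Callable[[], object],
                  total_requests: int,
                  concurrency: int) -> Dict[str, float]:
    """Call request_fn total_requests times (at least once) from
    `concurrency` threads and summarize how long the calls took."""

    def timed_call(_: int) -> Tuple[float, bool]:
        start = time.perf_counter()
        try:
            request_fn()
            succeeded = True
        except Exception:
            succeeded = False          # count the failure, keep the test running
        return time.perf_counter() - start, succeeded

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))

    latencies = sorted(duration for duration, _ in results)
    failures = sum(1 for _, succeeded in results if not succeeded)
    # Nearest-rank 95th percentile of the sorted response times.
    p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)
    return {
        "requests": total_requests,
        "failures": failures,
        "median_s": latencies[len(latencies) // 2],
        "p95_s": latencies[p95_index],
    }


def simulated_page_request() -> None:
    # Hypothetical stand-in for fetching a page from a staging server.
    time.sleep(0.002)


print(run_load_test(simulated_page_request, total_requests=50, concurrency=10))
```

To aim it at a representative copy of a site, request_fn could wrap a call such as urllib.request.urlopen on a staging URL. Running the same test before and after each software change gives a baseline for catching regressions before a spike does.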

5. Use HTML5 and other emerging standards

Not every method of dealing with Web outages is centered on the hardware or the connection between the site and the user. New standards, especially HTML5, have built-in mechanisms for making a site more reliable. Many of those mechanisms involve the use of advanced programming techniques to address site-to-site transmissions.

"HTML5 is a very important advance in browser capabilities," says Michael Gordon, the chief strategy officer and co-founder of Limelight Networks Inc., a CDN provider based in Tempe, Ariz. Features likely to be important for enterprises, he says, include the canvas tag, which provides dynamic rendering of bitmap images (think Flash-like 2D drawings) that will "significantly advance user interfaces"; the postMessage API, which will allow one Web server to communicate with another through a user's browser; and the client-side storage API, which will allow Web applications to store files on a user client.

In general, Web programming will become more like desktop programming, where data exchange, interface elements and APIs are more solid, and emerging technologies go through a rigorous testing process. One example of this is OpenID, which provides desktop-like functionality (in this case, authenticating one site with the log-in from another) to streamline development. The reusable code and predictable structure for OpenID, OpenSocial, OAuth and other Web standards will make the Web more reliable in the long run.

Not everyone agrees that these standards will promote better Internet uptime, however. Clearly, new standards encourage better programming methods, but they may also lead to even more Web applications and, with them, greater strain on the Internet.

"HTML5 will allow Web application developers to build richer desktop-like applications, and we will continue to see less dependence on operating systems and more dependence on the Web and Web browsers" to perform the most common tasks, such as managing data shared between Web applications, says Web developer Crosland. All this, in turn, could ironically make things even more congested.

In the end, most experts -- including analysts Skorupa and Staten -- insist that 100% uptime for every site on the public Internet is not necessarily the goal, and that site operators should still plan for occasional outages.

"You can't ensure your site will never go down," says Forrester's Staten. "Each company has to find the right balance between best efforts for availability and the cost of doing so."

John Brandon is a veteran of the computing industry, having worked as an IT manager for 10 years and as a tech journalist for another 10. He has written more than 2,500 articles and is a regular Computerworld contributor.