Web content management


Web content-management systems label and track information that's placed on a Web site so that it can be easily located, modified and reused. These systems are a critical component in personalising Web pages for site visitors.

Imagine a library without the Dewey decimal system, and you'll have a pretty good idea of the chaos that is a large Web site. Content-management systems can tame that chaos by cataloguing Web page data for quick and efficient tracking, editing and reformatting.

HTML, the basic language of the Web, simply describes how text, graphics and other data should be presented on a Web screen. It doesn't describe the data itself and offers little help when a webmaster needs to locate and modify particular documents. HTML by itself is static; once the page is posted to the Web, it must be modified off-line and re-posted in order for any changes to take place.

Power to the people

But the real power of the Web is its ability to move new information to the customer in near-real-time and to customise that information to suit individuals. That customisation, known in the Web world as individualisation or personalisation, is virtually impossible with static HTML pages, especially with the ongoing shortage of trained Web technicians. It's hard enough to create every Web page once, let alone regenerate that same page every time a change is needed. You'd need an army of Web page producers to create the custom pages of a site that presents data based on each customer's preferences and past purchases.

Dynamically generated Web pages, however, give Web site managers the ability to create a Web page once and then pour information into the page many times. Dynamic Web page generation lets Web technicians create an overall template once, with fields for customer-specific information. Then servers can pour specific data into the template to create individualised pages on demand.
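
The create-once, pour-many-times idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the template text, the field names (name, recommendation) and the customer records are all invented here, and a real site would use a fuller templating system.

```python
import html
from string import Template

# Hypothetical page template, written once. The $-fields mark the
# slots for customer-specific information.
PAGE_TEMPLATE = Template(
    "<html><body>\n"
    "<h1>Welcome back, $name</h1>\n"
    "<p>Recommended for you: $recommendation</p>\n"
    "</body></html>"
)


def render_page(customer: dict) -> str:
    """Pour one customer's data into the shared template."""
    # Escape the values so customer data cannot inject markup.
    safe = {key: html.escape(value) for key, value in customer.items()}
    return PAGE_TEMPLATE.substitute(safe)


# Invented sample data: one template, many individualised pages.
customers = [
    {"name": "Alice", "recommendation": "Gardening handbook"},
    {"name": "Bob", "recommendation": "Jazz box set"},
]
pages = [render_page(customer) for customer in customers]
```

Changing the design means editing PAGE_TEMPLATE once; every customer's page picks up the change the next time it is generated.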

Content management marshals information into labelled buckets of data that can be used again ("re-purposed") or quickly updated to reflect new information without needing human attention. At its simplest, Web content management resembles a word processor's mail-merge function that can mass-mail thousands of form letters, each containing customer-specific data.

The theory behind most content-management applications is simple: you build a set of Web page templates, hook them up to a content server, add a back-end database of information and attach the whole thing to a Web server. The content server automatically pulls information from the database, wrestles it into appropriate formats and stuffs the correct data into templates, generating new and updated pages automatically. Employees with little or no Web training can update content directly without ever touching a Web page. They simply enter information into database forms.
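
A minimal sketch of that pipeline, assuming an in-memory SQLite database stands in for the back-end database and a single function stands in for the content server. The table layout, the function names and the sample article are hypothetical, chosen only to show the flow from database form to generated page.

```python
import html
import sqlite3
from string import Template

# Hypothetical article template held by the content server.
ARTICLE_TEMPLATE = Template("<h1>$title</h1>\n<p>$body</p>")


def create_store() -> sqlite3.Connection:
    """Create the back-end database (in memory, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles (slug TEXT PRIMARY KEY, title TEXT, body TEXT)"
    )
    return conn


def save_article(conn: sqlite3.Connection, slug: str, title: str, body: str) -> None:
    """What a database form would do: store content, never touch a page."""
    conn.execute(
        "INSERT OR REPLACE INTO articles (slug, title, body) VALUES (?, ?, ?)",
        (slug, title, body),
    )
    conn.commit()


def generate_pages(conn: sqlite3.Connection) -> dict:
    """Pull every article from the database and stuff it into the template."""
    pages = {}
    rows = conn.execute("SELECT slug, title, body FROM articles ORDER BY slug")
    for slug, title, body in rows:
        pages[slug] = ARTICLE_TEMPLATE.substitute(
            title=html.escape(title), body=html.escape(body)
        )
    return pages


conn = create_store()
save_article(conn, "opening-hours", "Opening hours", "Open 9 to 5.")
pages = generate_pages(conn)

# An employee corrects the content through the form; the page is regenerated.
save_article(conn, "opening-hours", "Opening hours", "Open 9 to 6.")
updated = generate_pages(conn)
```

The person updating the opening hours only ever calls save_article; the HTML is produced entirely by the template and the generation step.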

Dynamically generated Web sites are more likely to be up to date and consistent in presentation. Design changes can propagate rapidly and automatically throughout an entire site. And most content-management systems include a workflow system that routes data automatically from creator to editor to approver. They can often lock unauthorised users out of the creation and edit cycle and provide an audit trail for error tracking and version control that allows users to return to a previous version of the site.
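
The workflow, lock-out, audit-trail and version-control features can be pictured as a small state machine. The sketch below is an assumption-laden toy, not any vendor's design: the stage names, the role table and the ContentItem class are all hypothetical.

```python
from dataclasses import dataclass, field

# Assumed stages, in order: creator -> editor -> approver -> published.
STAGES = ["draft", "editing", "approval", "published"]
# Hypothetical role table: which role may move an item out of each stage.
ROLE_FOR_STAGE = {"draft": "creator", "editing": "editor", "approval": "approver"}


@dataclass
class ContentItem:
    body: str
    stage: str = "draft"
    versions: list = field(default_factory=list)     # earlier bodies, oldest first
    audit_trail: list = field(default_factory=list)  # (user, action) pairs

    def edit(self, user: str, role: str, new_body: str) -> None:
        """Change the content, keeping the old version for later recovery."""
        if role not in ("creator", "editor"):
            raise PermissionError(f"a {role} may not edit content")
        self.versions.append(self.body)
        self.body = new_body
        self.audit_trail.append((user, "edit"))

    def advance(self, user: str, role: str) -> None:
        """Route the item to the next stage if the user holds the right role."""
        required = ROLE_FOR_STAGE.get(self.stage)
        if required is None:
            raise ValueError("item is already published")
        if role != required:
            raise PermissionError(f"only a {required} may advance from {self.stage}")
        self.stage = STAGES[STAGES.index(self.stage) + 1]
        self.audit_trail.append((user, f"advance to {self.stage}"))

    def revert(self, user: str) -> None:
        """Return to the most recent earlier version."""
        if not self.versions:
            raise ValueError("no earlier version to return to")
        self.body = self.versions.pop()
        self.audit_trail.append((user, "revert"))


item = ContentItem("Price: 10")
item.edit("ann", "creator", "Price: 12")
item.advance("ann", "creator")    # draft -> editing
item.advance("ed", "editor")      # editing -> approval
item.advance("pat", "approver")   # approval -> published
item.revert("pat")                # the new price was wrong; restore the old one
```

Every action lands in audit_trail with the name of the user who took it, which is the raw material for error tracking.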

Most content-management systems support tagging structures that allow content reuse without manual reformatting. XML, the best known of these, uses an HTML-like structure of tags to describe the data on a page rather than its appearance. Content-management systems also generally employ scripting languages such as Tool Command Language (Tcl) and JavaScript.
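
To see why descriptive tags allow reuse, consider one XML record rendered two ways. The sketch uses Python's standard xml.etree.ElementTree parser; the product record and its tag names (product, name, price, description) are invented for illustration and don't come from any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical product record. The tags say what each piece of data is,
# not how it should look.
PRODUCT_XML = """
<product>
  <name>Desk lamp</name>
  <price currency="GBP">24.99</price>
  <description>Adjustable arm, warm white bulb.</description>
</product>
""".strip()


def parse_product(xml_text: str) -> dict:
    """Extract the labelled fields from the XML record."""
    root = ET.fromstring(xml_text)
    price = root.find("price")
    return {
        "name": root.findtext("name"),
        "price": price.text,
        "currency": price.get("currency"),
        "description": root.findtext("description"),
    }


def as_html(product: dict) -> str:
    """Full presentation for a product page."""
    return (
        f"<h2>{product['name']}</h2>"
        f"<p>{product['description']}</p>"
        f"<p>{product['currency']} {product['price']}</p>"
    )


def as_text_line(product: dict) -> str:
    """Compact presentation for a plain-text price list."""
    return f"{product['name']}: {product['currency']} {product['price']}"


product = parse_product(PRODUCT_XML)
web_page_fragment = as_html(product)
price_list_entry = as_text_line(product)
```

Neither output required anyone to reformat the source record by hand; each is just a different view of the same tagged data.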

In practice, content management can be difficult and expensive. Developing an effective content-management system for a large Web site takes a great deal of expert customisation, especially in building scripts to handle data flow and in constructing effective templates. Each new type of Web page requires new templates and workflow scripting, which can block innovation.

Large organisations with hundreds of thousands of Web pages often have multiple Web sites, each with different needs, data, formats and locations. Organisations with existing Web sites must also convert thousands of pages to the new content-management system.

The problem grows even worse when organisations exchange data destined for Web sites. XML relies on additional data descriptions, called document type definitions (DTDs), which specify the tags a document may contain, to adequately define the content of most documents. DTDs tend to be subject-specific and aren't easily passed between organisations.

ICE is cool

Many of the standards built around XML are still in flux. However, the Information and Content Exchange (ICE) protocol is specifically designed to alleviate many content-management and data-exchange problems. First recognised by the World Wide Web Consortium in 1998, ICE describes how managed content should be passed between Web sites. It provides a common vocabulary of terms and methods for exchanging data using XML. It will be especially useful in creating syndicated content - information leased or sold to multiple Web sites - with a minimum of translation hassles.

The first version of the standard was proposed by leading content-management developers, including Microsoft, Sun Microsystems, Adobe Systems and Vignette. It applies specific formatting rules to virtually any kind of data that can be presented on a Web site, even down to mundane items such as date and time.