The amount of computing power available to Australian bioinformatics researchers continues to climb, with the Victorian Partnership for Advanced Computing (VPAC) this week switching on a massive $1 million clustered server from IBM that will more than double the organisation's computing power.
That server, an IBM 1350 Linux cluster, is built around 96 dual-processor systems housing a total of 192 of Intel's 2.8GHz Xeon processors. That makes it twice the size of a similar system commissioned last month by the University of Melbourne's Melbourne Advanced Research Computing Centre (MARCC); the two machines are housed together in VPAC's data centre in Melbourne.
The new computing power is sorely needed in a scientific community where use of complex models is increasing rapidly. "We've seen the need for a big increase in capacity to run lots and lots of jobs," says VPAC CEO Bill Appelbe. "We have 200 users at VPAC and we're flat to the wall as far as capacity."
As at MARCC, the new VPAC system will be available for a variety of applications, but bioinformatics applications will initially top the list. VPAC will continue to offer researchers access to its earlier IBM pSeries Unix server, a three-year-old system that offers the ability to move data 64 bits at a time compared with 32 bits through the Xeon chips.
Because of their low cost and ready scalability, computing clusters based on off-the-shelf Intel processors are rapidly gaining favour among research communities, even though each chip handles less data at a time than the 64-bit processors found primarily in expensive, proprietary Unix systems. The 64-bit technology allows manipulation of massive data sets and problems requiring many gigabytes of system memory -- but for many workloads this benefit pales in comparison to the strength-in-numbers approach of aggregating large numbers of Intel CPUs.
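To put rough numbers on the memory point, the arithmetic can be sketched in a few lines of Python. This is an illustration only: it assumes a flat, byte-addressed memory with no extensions that widen the physical address, and the function name is hypothetical rather than anything supplied by VPAC or IBM.

```python
def addressable_bytes(address_bits: int) -> int:
    """Bytes reachable with a flat, byte-addressed pointer of the given width (assumed model)."""
    return 2 ** address_bits


GIB = 2 ** 30  # one gibibyte, in bytes

# A 32-bit pointer can reach 4 GiB; a 64-bit pointer can, in theory, reach 2**34 GiB.
print(addressable_bytes(32) // GIB)
print(addressable_bytes(64) // GIB)
```

Under those assumptions a single 32-bit process tops out at about 4 gigabytes, which is why problems needing "many gigabytes" in one address space favour the 64-bit machines, while work that splits into many smaller jobs suits the cluster.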
The processors, which power most of the world's desktop computers, have already played a major part in early efforts -- known as 'Beowulf' clusters -- in which dozens or hundreds of normal desktop PCs are linked together using Linux and software that co-ordinates work between the nodes.
Beowulf clusters have appeared en masse at many universities in recent years, typically under the guidance of programmers seeking to harness the power of systems left idling within locked departmental computer labs during night-time hours.
While they have proved the value of clustered solutions in massively parallel number-crunching, Beowulf clusters present their own complications because they become harder to manage and upgrade the larger they get. IBM has made this management a cornerstone of its life sciences push, bundling cluster management software that allows researchers to address all of a cluster's servers -- including their disk space, processing power and memory -- as though they belonged to a single machine.
"Until recently, most people building Linux clusters were building them from scratch. But they're just not cost-effective beyond being one person's plaything," said Tony Palanca, IBM's Australia-New Zealand regional manager for life sciences. "But cluster management is getting more mature, and a number of vendors are supporting Linux. One of the focus areas for IBM is to deliver Linux servers as a genuine integrated platform."
That effort will get a significant boost through the VPAC-IBM deal, which will see VPAC hosting a number of IBM-developed tools designed as part of the company's $US200 million life sciences research investment. Those tools include the Tiresias discovery application and BioDictionary, which replaces dozens of applications to let researchers annotate a genome in one pass. The tools have previously been hosted from IBM facilities in the US, but mirroring them at VPAC will provide better performance and utility for Australasian life sciences companies.
By complementing its expanding installed base of computing power with a menu of relevant applications, Palanca said IBM hoped to level the playing field for small local firms with great ideas but few resources to turn them into reality.
"The bio community is dominated by small companies that don't have the in-house expertise to pick up on these tools and learn how to apply them," he explained. "We want to provide a place where these things can be discussed without having to make the big investment up-front."
"We think that's going to have a pretty positive effect," he continued. "It provides a rallying point to begin to work on training researchers how to use the tools, and where we can develop a meaningful collaboration with the academic community to [progress] this whole revolution that's happening around applying IT more effectively to life science."