Supercomputer race: Tricky to boost system speed

The Top500 list is always climbing to new heights. Can we believe the hype?

Beep! Beep!

IBM calls Roadrunner, which cost Los Alamos US$120 million, a "hybrid" architecture because it uses three kinds of processors. Basic computing is done on an off-the-shelf, 3,250-node network, with each node consisting of two dual-core Opteron microprocessors from Advanced Micro Devices.

But Roadrunner's magic comes from a network of 13,000 "accelerators" in the form of Cell Broadband Engines originally developed for the Sony PlayStation 3 video game console and later enhanced by IBM. Each Cell chip contains an IBM Power processor core surrounded by eight simple processing elements.
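As a back-of-the-envelope check, the figures quoted above work out to one Cell accelerator per Opteron core. The short Python sketch below uses only the numbers given in this article; the variable names are illustrative and do not come from IBM or Los Alamos.

```python
# Figures as quoted in the article; names are illustrative only.
NODES = 3_250
OPTERONS_PER_NODE = 2
CORES_PER_OPTERON = 2   # dual-core Opterons
CELL_CHIPS = 13_000
SPES_PER_CELL = 8       # "eight simple processing elements" per Cell

opteron_cores = NODES * OPTERONS_PER_NODE * CORES_PER_OPTERON
cells_per_opteron_core = CELL_CHIPS / opteron_cores
total_spes = CELL_CHIPS * SPES_PER_CELL

print(opteron_cores)           # 13000
print(cells_per_opteron_core)  # 1.0
print(total_spes)              # 104000
```

On these numbers, the machine pairs every general-purpose core with its own accelerator, and the simple processing elements outnumber the Opteron cores eight to one.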

The Cells are optimized for image processing and mathematical operations, which are central to many scientific applications. A Cell can operate on all the elements of a well-defined string or vector at once, which makes it ideal for the matrix math in the Linpack benchmark. Los Alamos says the Cells speed up computation by a factor of four to nine over what the Opterons alone could do. Nevertheless, the lab says it expects its production programs to run at sustained speeds of only 20 percent to 50 percent of the celebrated 1-petaflops benchmark result.
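To put the lab's expectation in absolute terms, the sketch below converts the quoted 20 percent to 50 percent range into teraflops. It assumes the benchmark figure is exactly 1 petaflops, as the article rounds it; the names are illustrative.

```python
# Assumes the Linpack result is exactly 1 petaflops, as rounded in the article.
LINPACK_PFLOPS = 1.0
SUSTAINED_FRACTIONS = (0.20, 0.50)  # range quoted by Los Alamos

low, high = (LINPACK_PFLOPS * f for f in SUSTAINED_FRACTIONS)
print(f"{low * 1000:.0f} to {high * 1000:.0f} teraflops")  # 200 to 500 teraflops
```

In other words, production codes are expected to sustain roughly a fifth to a half of the headline number.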

The advantages of using three kinds of processors come at a cost. Just as the Linpack code had to be optimized for the machine, so must most other programs. A recent report from Los Alamos said this of the effort required to get an important simulation tool to run on Roadrunner: "Accelerating the Monte Carlo code called Milagro took many months, several false starts and modifications of 10 percent to 30 percent of the code." But in the end, the lab said, Milagro ran six times faster with the Cell chips than without them, and that was "a crucial achievement for the acceptance of Roadrunner."

Andrew White, Roadrunner project director at Los Alamos, told Computerworld that the effort to port and optimize code for Roadrunner was "less than we thought it would be" after programmers got some experience with it. A program with "tens of thousands of lines of code" is taking about one man-year to get going on the supercomputer, he said.

Invoking specialization

University of Tennessee computer science professor Jack Dongarra is one of the developers of the Linpack benchmark and a co-publisher of the Top500 report. He calls Roadrunner a "general-purpose computer" but one that, because of its hybrid architecture, "specializes in what it can do." Invoking that specialization is not trivial, he admits.

"If you are writing a program for Roadrunner, you essentially have to write three programs -- one for the AMD Opteron processor, one for the Power core that's on the Cell chip and one for the vector units in the Cell chip," he says. "The only way to get to a point where you'd be happy with the performance is to rewrite your old applications. The guys at Los Alamos believe that they can in fact benefit by rewriting their code."
