Scyld Software's Becker on Linux, clustering, grid

More work needs to be done on Linux at the operating system level, grids will have limited appeal, and there will be a mass movement among organizations large and small to embrace clustering software. These were some of the conclusions and predictions of Don Becker, cofounder of the Beowulf clustering project and a significant contributor to the Linux kernel, as he attended the LinuxWorld show in San Francisco last week.

Becker is the founder and chief scientist of Linux clustering vendor Scyld Software, a subsidiary of Linux workstation and server vendor Penguin Computing. Privately held Penguin acquired Scyld in June 2003. Becker founded Scyld (pronounced "scaled" or "skilled") back in 1998, building on work he did while at NASA (the U.S. National Aeronautics and Space Administration), where he started the Beowulf Parallel Workstation high-performance computing cluster project. NASA was interested in the project as a way to help model climate data.

IDG News Service caught up with Becker as he took a quick break from demonstrating Scyld clustering software at the show.

What are your thoughts on how Linux has developed? Symantec Corp. and other vendors at the show have been talking about Linux entering a golden age in terms of adoption by enterprises. Do you agree?

Linux has evolved tremendously. When I started with Linux as an end user, there were probably a few hundred users, and that's probably an overestimation. I quickly became a developer because Linux didn't have reasonable networking support, and soon afterwards it needed it. Linux has fulfilled the promise of Unix, going from [running on] a wristwatch to the fastest [high-end] machine.

Linux is state of the art in most cases, but [being] state of the art isn't good enough. There are many holes in what Linux does, and there are still many opportunities [for developers]. Not everything has been done. Five years ago, Linus [Torvalds, the creator of Linux] said the basics had been done and all the interesting things [to be done] would be at the application level. That has turned out not to be the case.

[There's work to be done in] the storage and file systems areas, which are changing very quickly. Linux clearly needs general-purpose, easily usable network-attached storage. iSCSI is a rapidly evolving technology, and within the next year or two we should see some products or implementations you wouldn't call 'clusters,' but enterprisewide storage shared across multiple systems. There's not a name for it yet.

There's also evolution going on inside the Linux kernel, evolving the VFS (virtual file system) layer. Linux isn't behind any [other operating] systems; it's just that there's a lot of development going on there.

What about grids? They seem to be a major focus at LinuxWorld, although often in terms of vendors trying to win over customers leery of the complexity they perceive around grids.

Grid tools have been primarily developed on Linux, so that's their platform of choice. A grid implementation is much more difficult than clustering. You need to develop a whole new infrastructure, and then deploy and update it in ways that are usable. It's at least as difficult as deploying a network protocol. Look at IPv6 (Internet Protocol version 6): we're not even halfway to deploying it.

People don't seem to have a good definition for grid and clustering. We at Scyld define a cluster as something you can administer from a single point and where you can install applications so they're immediately available. A grid has separate administration domains as part of its definition, and you're trying to work with people across a company and across the world.

When vendors are talking about grid [computing] for small to midsize businesses (SMBs), it's more likely those companies are deploying utility computing or clustering. The 'grid' term tends to cover 'true grid,' in the sense of wide-area cooperation, as well as utility computing and clustering.

The real opportunity will come as people start easily incorporating clustering into every machine [they have], so each machine becomes the start of a cluster and can scale up. Utility computing will be making these sets of clusters work together. The only place 'true grid' has a significant likelihood of success will be with global enterprises that have a single point of administration for the CIO (chief information officer).

What are you doing at Scyld to make your clustering more applicable for mainstream use?

We've always focused on ease of use. In the past, we focused on high-performance computing, a rather narrow market, and we need to do additional things to make Scyld more applicable elsewhere. We're improving our logging system, working on lights-out diagnostics, and defining the boot system so you can reconfigure a wider variety of hardware automatically.

Our definition of clustering is that you take independently operating machines, put them together, and try to make them look like a single unified system, so it can run as a single high-performance system or as a more reliable system. Many of the vendors offering clustering are not doing interesting development; they're selling ad hoc tools. The research groups are doing it right and rethinking the problem. The end user doesn't need to know they're running a cluster. Later on, they'll need to know a bit to get benefits out of clustering, but they shouldn't have to read a thick book in order to be able to set up a cluster.
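[Editor's note: To give a concrete sense of running a cluster "as a single high-performance system," here is a minimal sketch of the kind of parallel job a Beowulf-style cluster typically executes. It assumes an MPI implementation such as MPICH or LAM/MPI is installed on the cluster; it illustrates the general technique and is not Scyld's own tooling.]

    /* Minimal MPI job: every process in the cluster-wide job reports its
       rank. The user launches one job; the cluster presents it as a single
       parallel program spread across the nodes. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;

        MPI_Init(&argc, &argv);               /* start the parallel runtime */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's id within the job */
        MPI_Comm_size(MPI_COMM_WORLD, &size); /* total processes in the job */

        printf("Hello from process %d of %d\n", rank, size);

        MPI_Finalize();                       /* shut the runtime down cleanly */
        return 0;
    }

[Compiled with mpicc and launched with something like "mpirun -np 8 ./hello", the same binary runs across the cluster's nodes while appearing to the user as one job, which is the single-system view Becker describes.]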

We've also heard a good deal about open source licenses at LinuxWorld. Are there too many today?

I think there's room in the space for about 10, maybe not even that many. The people writing the licenses are probably better at developing code than at writing licenses; the terms don't always make sense. The hot button is [defining] commercial distribution versus noncommercial distribution. It's very hard to figure out where the line falls. Commercial distribution clearly implies there's some payment involved, a transaction, but it isn't clear enough.

The GPL (General Public License) is clearly not perfect, but like any major standard you deal with in the technical area, it's working well enough. It's not as intuitive as it needs to be.

Given that you used to work at NASA, what's your take on the Shuttle?

I still follow it quite a bit. An unfortunate point is that much of the science NASA is doing is now underfunded because of the energy they've put into the Space Station and the Shuttle.
