Computerworld

Data Center Gets Star Treatment

As it rushed to complete work on Star Wars: Episode III - Revenge of the Sith last February, special effects company Industrial Light & Magic (ILM) found itself split between two worlds. The new home of the San Rafael, California-based studio was in the final phase of construction as part of the Letterman Digital Arts Center (LDAC), an 850,000-square-foot, four-building campus in San Francisco's Presidio national park. Two of those buildings today serve as headquarters for George Lucas' Lucasfilm as well as its ILM and LucasArts Entertainment Co. subsidiaries.

ILM had been given responsibility for moving IT operations for all three business units. But Chief Technology Officer Cliff Plumer was also in an enviable position. His group had a rare opportunity to create an IT infrastructure, including new data centers and the network, from the ground up.

That February, however, ILM didn't have the processing power in its overcrowded data center in San Rafael to finish rendering all of the movie frames on time, and it didn't have the space for more servers. Bringing down the 2,500-processor server farm to move it would have had a huge impact on operations, since it runs 24 hours a day, says Plumer.

Keeping Star Wars fans waiting was not an option, so ILM bought an additional 250 dual-processor blade servers, installed them in the new data center at the LDAC 20 miles away and connected them into the render server farm in San Rafael by way of a 10Gb/sec. fiber-optic link.
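In back-of-the-envelope terms, those blades bought ILM a roughly 20% bump in raw processor count. A minimal sketch of that arithmetic, assuming every processor counts equally toward render capacity (an assumption, not a figure from ILM):

    # Rough capacity estimate from the figures quoted above; treating
    # all processors as equivalent is an assumption, not ILM's math.
    existing_processors = 2500          # render farm in San Rafael
    new_blades = 250                    # blades installed at the LDAC
    processors_per_blade = 2            # dual-processor blades

    added = new_blades * processors_per_blade       # 500 processors
    boost = added / existing_processors             # 0.2
    print(f"Added {added} processors, roughly a {boost:.0%} boost in raw capacity")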

Today ILM resides in a quiet setting with views of the Golden Gate Bridge. From the outside, the style of the buildings is more in keeping with the former Army base's heritage than what one would expect for a high-tech special effects and movie production company. There are no signs, and even at the main entrance, hidden behind plantings, the only indication of who occupies the buildings is a statue of Star Wars character Yoda atop a fountain.

Inside, everything is state of the art. The multistory buildings include a 13,500-square-foot data center, 18-inch raised floors that accommodate more than 600 miles of network cabling, and a 3,000-square-foot media data center. The latter is capable of simultaneously delivering high-definition video to remote clients and to several in-house viewing spaces.

The main data center includes a 3,000-processor server farm, approximately 150TB of network-attached storage and a 10Gb Ethernet backbone that may be the largest built by any company to date. It has some 340 10Gb ports and supports traffic loads of 130TB per day. Power and cooling systems sit in two adjacent rooms, which Plumer says helps to keep maintenance traffic out of the data center.
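Taken at face value, that 130TB-a-day figure works out to a sustained average of only about 12Gbit/sec. across the entire backbone, which suggests the 340 ports are provisioned for peaks rather than averages. A minimal sketch of the arithmetic, assuming decimal terabytes and traffic spread evenly over 24 hours (both assumptions, not ILM's numbers):

    # Average load implied by 130TB of traffic per day, assuming decimal
    # terabytes and an even 24-hour spread -- assumptions, not ILM figures.
    bytes_per_day = 130e12
    bits_per_day = bytes_per_day * 8
    seconds_per_day = 24 * 60 * 60

    avg_gbps = bits_per_day / seconds_per_day / 1e9
    print(f"Average sustained backbone load: about {avg_gbps:.0f} Gbit/s")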

Going Live

Getting moved wasn't easy. The new data center was ready to go online in February when the servers and other equipment arrived at the LDAC. The IT staff had already moved in, becoming the building's first tenant. But the rest of the building was far from finished. "We had to wear hard hats and goggles" during those first weeks, recalls network engineer Mike Runge.

With all the construction, finding a place to store equipment shipments -- dual-Opteron Titan64 Superblades from Angstrom Microsystems and networking gear from Foundry Networks -- was tough. "It was hard to find a room that would lock and be free of dust," Runge says. Once the equipment was unboxed, however, the installation took Angstrom technicians just two hours, says Lalit Jain, Angstrom's CEO.

"Within seconds of powering [the servers] up, they were processing an image," Plumer says. Since then, the rest of the data center equipment has moved over. The bulk of it arrived in mid-August, when ILM's 400 artists and other staffers began moving in.

ILM's move is part of a corporate consolidation that also includes video game maker LucasArts as well as Lucasfilm. Some 1,500 people work in the new facility. Moving LucasArts and Lucasfilm was fairly straightforward, says Kevin Clark, director of IT operations. ILM was more difficult. "The infrastructure is much more complex," he says.


Net Gains

The LDAC project gave Plumer's team a unique opportunity to rebuild its IT infrastructure from scratch. The team started by interviewing users about their needs, says Gary Meyer, systems engineer and project manager. From there, the team developed a narrative description of the technical infrastructure and handed it to the design teams.

"The biggest key is the networking infrastructure," says Plumer.

"This industry tends to be a good 10 years ahead of general business in terms of critical network-capacity needs and capability," says Rob Enderle, principal at Enderle Group in San Jose. ILM "will probably be passed relatively quickly, given [that] this need crisscrosses their industry."

The architecture consists of three networks: one for a new voice-over-IP telephone system and two separate 10Gb network cores. One core is for video in the media data center, and the other is for the main data center, which handles the render server farm and back-end business systems. A 10Gb fiber backbone runs from the data centers to each building and out to the distribution closets. All employees now have 1Gb/sec. connections, up from 100Mb/sec. in the old facilities. ILM also pulled fiber to each artist workstation. "Putting the fiber in gives us the ability to go to 10 Gb or greater to the desktop," Meyer says.

Meyer won't be surprised if ILM's artists max out their 1Gb connections within a year. Between downloading very large files and streaming high-definition video to the desktop, they could start to fill up the pipe, he says.
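It is not hard to see why. A minimal sketch, assuming an uncompressed 1080p stream at 10-bit 4:2:2 sampling and 24 frames per second (illustrative assumptions, not ILM's stated formats), shows a single stream already brushing the ceiling of a 1Gb/sec. link:

    # Bandwidth of one uncompressed HD stream under assumed parameters:
    # 1080p frames, 10-bit 4:2:2 (about 20 bits per pixel), 24 fps.
    width, height = 1920, 1080
    bits_per_pixel = 20
    frames_per_second = 24

    stream_gbps = width * height * bits_per_pixel * frames_per_second / 1e9
    print(f"One uncompressed HD stream: about {stream_gbps:.2f} Gbit/s")   # ~1 Gbit/s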

Those kinds of anticipated bandwidth demands resulted in very strict requirements for network equipment, says Runge. "We spent months doing a bake-off between several vendors," he says. Foundry won because it dropped the fewest packets, a critical metric for an organization that needs to run multiple high-definition video streams.

While the new buildings gave Plumer a blank slate for a new data center, the need for more space wasn't the biggest issue. "It's more about power and cooling," he says. During the data center's design phase, heat and power-density requirements for IT equipment rose faster than anyone expected. The original design called for 200 watts per square foot.

"Partway through the process, we threw up a flare and said, 'We think we've made a mistake. We think we should design for 400 watts per square foot.' And we were basically laughed out of the room," says Meyer. Today, the room supports 330 to 340 watts per square foot and could easily consume 400, he says.

One major reason for the increase was the server farm used to render movie images frame by frame. As ILM has adopted blade servers, power density has gone up from 10 kilowatts per rack a few years ago to nearly 20 kilowatts for its blade servers today. ILM adjusted the original data center design but still has had to spread out blade servers to dissipate heat. "It's a constant job of balancing the room," Plumer says.
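The arithmetic behind that balancing act is simple enough. A minimal sketch using the rack and density figures quoted above (the footprint calculation is an illustration, not ILM's planning method):

    # Floor area of design capacity that one rack consumes at a given
    # power density. Rack figures come from the article; the rest is
    # an illustration of why dense blade racks get spread out.
    design_watts_per_sq_ft = 400
    rack_kw_then, rack_kw_now = 10, 20      # kW per rack, then vs. today

    def floor_budget_sq_ft(rack_kw, watts_per_sq_ft):
        """Square feet of the room's power budget one rack uses up."""
        return rack_kw * 1000 / watts_per_sq_ft

    print(floor_budget_sq_ft(rack_kw_then, design_watts_per_sq_ft))   # 25.0 sq ft
    print(floor_budget_sq_ft(rack_kw_now, design_watts_per_sq_ft))    # 50.0 sq ft

At 20 kilowatts, each blade rack soaks up 50 square feet of a 400-watt-per-square-foot budget, several times its physical footprint, which is why the racks end up spread across the floor.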

Data on the Move

Handling storage needs during the transition was another challenge. ILM had 18 Network Appliance R200 filers connected to 68TB of storage in San Rafael. Those arrays needed to be online around the clock in order to feed files to the render server farm.

ILM was also using SpinFS from Spinnaker Networks, a distributed file system that virtualizes storage and establishes a single, unified namespace across all of the filers. SpinFS eliminated a performance bottleneck that resulted when many machines in the render farm requested the same data at the same time.

ILM uses the technology to distribute the data across multiple disk arrays, says systems developer Mike Thompson. ILM also used it to migrate data between San Rafael and the LDAC.

Thompson added 78TB of near-line storage and deployed 10 more R200s running SpinFS in the LDAC, then connected them over the 10Gb link to the arrays in San Rafael. "No matter which [end] you are on, you see all the storage," he says. Using the near-line storage as a buffer, Thompson pulled arrays out of the storage pool in San Rafael and reconnected them in the LDAC without disrupting operations. The near-line storage now holds completed projects until the data is ready to migrate to tape.
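Conceptually, what made the live migration possible is that clients address a stable, unified namespace while the physical arrays behind it come and go. The sketch below models that idea in generic terms; it is not SpinFS's actual interface, and every name in it is hypothetical:

    # Toy model of migrating storage behind a unified namespace. This is
    # an illustration of the idea, not SpinFS's API; the class, method
    # and path names here are hypothetical.

    class Namespace:
        """Maps logical paths to whichever physical array currently holds the data."""

        def __init__(self):
            self.placement = {}                     # logical path -> array name

        def add(self, path, array):
            self.placement[path] = array

        def evacuate(self, old_array, buffer_array):
            """Drain old_array so it can be unracked and shipped to the new site."""
            for path, array in self.placement.items():
                if array == old_array:
                    # copy the data to the near-line buffer, then repoint the
                    # path; clients keep using the same logical path throughout
                    self.placement[path] = buffer_array

    ns = Namespace()
    ns.add("/shows/episode3/shot_0420", "san_rafael_array_07")
    ns.evacuate("san_rafael_array_07", "ldac_nearline_01")
    print(ns.placement)     # same path, now served from the LDAC buffer

Because the logical path never changes, the render farm keeps reading and writing while an array is drained to the buffer, unracked, trucked 20 miles and re-added on the other end.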

Once the last staffers and equipment from the three organizations are finally moved in, the data center will be at about 60% of capacity, Plumer says. The infrastructure design, as deployed, is supposed to last five years. Already, however, the IT staff is anticipating new needs.

"We're migrating production to 64-bit," says Plumer, which means swapping out older servers for units with dual-core Opteron processors. And the film industry could be moving to 4K frames, which would double the storage requirements.

"We'll stay at 68TB for a year or two," Thompson predicts. "But as shots get more complex ... it's hard to tell."