Dell primes servers for virtualization

PowerEdge R805 and R905 boast VM-friendly features, raise bar for HP and IBM

Benchmarking virtualization hosts

Although virtualization has been around for decades, its surge in adoption is comparatively recent. As a result, it does not yet benefit from industry-standard benchmarks. The first widely accepted benchmark, called GrandSlam and designed by IBM, has been retired. A second suite, vConsolidate, designed by Intel, runs database, Java, mail, and Web servers and computes a performance rating by combining their results. The general industry perception is that while the vConsolidate approach is valid, its specific implementation tends to disfavor AMD processors. So the industry has quietly set it aside.

The best tests currently available are in a suite called VMmark from VMware (available at no cost from VMware's Web site). It is the most widely quoted benchmark in use today. However, it is difficult to run (making it hard for in-house analysts to duplicate test results), and it tends to under-represent the importance of RAM. Hence, it's viewed as a useful measure, albeit one that's not truly representative of the typical IT profile. VMmark scores are measured in a peculiar unit called a "tile." As defined by VMware, a tile is a unit of work that aggregates different workloads running simultaneously on a system. The more tiles the system can run, the greater the system capacity. Because the tile score computation includes a performance factor, it is fair to view tiles as a measure of both performance and scalability.

Examining the posted VMmark scores, we see that the R805 comes in at 7.96 tiles, and the R905 at 14.28. When compared with systems from other vendors, notably HP and IBM, these scores put the R905 at the top of the list in the category of 16-core servers, and the R805 in the middle of the pack.

Energy efficiency is a separate measure, and one for which vendor-neutral benchmarks are starting to emerge, most notably SPECpower_ssj2008. This series of tests runs server-side Java (the "ssj" in the benchmark name) on the SUT (system under test) and determines a maximum workload. It then measures the power drawn at each 10 percent step of that workload, averages the readings, and publishes a single-number score: the average number of ssj operations per watt. This number is possibly useful when comparing two servers in the abstract. But in the day-to-day work of an IT site, the number is problematic. What most sites want to know is how much work the server can do and how many watts it consumes. VMmark provides the former. And we'll now examine the latter.
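To make the arithmetic behind a single ops-per-watt figure concrete, here is a minimal Python sketch. It is not SPEC's scoring code, and the published run rules include details it omits. The readings are invented for illustration (a hypothetical server that peaks at 100,000 ssj operations per second and draws 160 to 250 watts), and the names READINGS and ops_per_watt are my own.

```python
# Hypothetical readings, not measurements from any real server:
# (ssj operations per second, average watts) at 10%, 20%, ..., 100% of peak load.
READINGS = [(10_000 * k, 150 + 10 * k) for k in range(1, 11)]


def ops_per_watt(readings):
    """Mean throughput divided by mean power across all load levels."""
    mean_ops = sum(ops for ops, _ in readings) / len(readings)
    mean_watts = sum(watts for _, watts in readings) / len(readings)
    return mean_ops / mean_watts


overall = ops_per_watt(READINGS)
# Efficiency at the top load level alone, for comparison.
at_peak = READINGS[-1][0] / READINGS[-1][1]
print(f"overall: {overall:.1f} ops/W, at 100% load: {at_peak:.1f} ops/W")
```

With these made-up numbers the overall figure comes to roughly 268 ops per watt, while the same machine at full load manages 400. The single published number says nothing about total capacity or total draw, which is the limitation described above.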

The design of SPECpower as a ratio makes it comparable to the secondary information on the grocery shelves that tells you how much an ounce of cereal costs, but says nothing about the cost of the entire box or the quality of its contents. And even then, it's difficult to know how the ssj benchmark results map to your particular server. The ssj code is not Java EE-based, and it performs no database access, so the server-side tests are unlikely to duplicate activity at most IT sites. Moreover, the results assume a usage profile that operates the machine for equal periods at 10 percent, 20 percent, 30 percent of load, and so on, all the way up to 100 percent. Again, it's not clear how this maps to actual server usage at most sites. So, for the nonce, I am sticking to VMmark for performance/workload capacity and raw measures of watts at the wall for power consumed. (I measured the watts with the Kill a Watt meter, which is an excellent, inexpensive tool for measuring power usage.)
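The effect of the equal-time assumption can be sketched in a few lines of Python. Everything here is hypothetical: the readings are invented, the "mostly idle" profile is an assumed example of a site whose server spends 80 percent of its time at 10 percent load, and weighted_ops_per_watt is an illustrative helper, not part of any benchmark.

```python
def weighted_ops_per_watt(readings, time_share):
    """Ops per watt when each load level is weighted by the time spent there.

    readings:   {load_pct: (ops_per_second, watts)}
    time_share: {load_pct: fraction of operating time at that load}
    """
    total = sum(time_share.values())
    ops = sum(readings[p][0] * share for p, share in time_share.items()) / total
    watts = sum(readings[p][1] * share for p, share in time_share.items()) / total
    return ops / watts


# Invented readings: peak of 100,000 ops/s, power rising from 160 W to 250 W.
readings = {10 * k: (10_000 * k, 150 + 10 * k) for k in range(1, 11)}

# The benchmark-style assumption: equal time at every load level.
equal_time = {p: 0.1 for p in readings}
# An assumed site profile: 80% of the time at 10% load, 20% at 50% load.
mostly_idle = {10: 0.8, 50: 0.2}

print(f"equal time:  {weighted_ops_per_watt(readings, equal_time):.1f} ops/W")
print(f"mostly idle: {weighted_ops_per_watt(readings, mostly_idle):.1f} ops/W")
```

On these invented numbers the equal-time profile yields about 268 ops per watt and the mostly idle profile about 107, so the same hardware can look very different depending on how a site actually loads it.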

