Virginia Tech
System X Takes on the Grand Challenge
"The power of the Xserve cluster can streamline the amount of time computational scientists need for research," says Dr. Herdman. "Some problems may take a month of computer time. We want to cut that down to two or three hours. Instead of doing something in six months, we're talking about a matter of days. The speed with which we can design and solve problems is very important to us."
"We've put a state-of-the-art, ahead-of-the-curve machine here for our research faculty to use, and they're very excited about having the Macs here," says Dr. Terry Herdman, director of research computing at Virginia Tech.
The Technology of Accuracy
Dr. Varadarajan points to an aspect of the Xserve G5 that's of particular importance: ECC memory, a memory system that uses error-correcting code logic to protect against corrupt data and read/write errors.
Computational science applications vary widely in their run time: some run for just minutes or hours, but others churn on for months or even years. During this time, bit errors (caused by electrical interference, for example) can occur in a 64-bit binary number.
"Think of it as potential loss of long-term memory in humans," Dr. Varadarajan explains. "Silicon memory has nothing to do with aging, but it can flip and you might get an incorrect result." The Xserve's memory subsystem prevents these errors.
Error correction is vitally important for the kinds of grand-challenge computational science problems handled by System X. "With these problems," says Dr. Varadarajan, "nobody knows what the data is supposed to look like." Without a way to correct for errors, scientists have to repeat a run five or six times. And, even then, all of their results may only point roughly in the same direction. Their machines are telling them nothing useful.
"But with System X," he says, "because the Xserve uses error-correcting memory, if one bit flips out of 64, the memory subsystem will detect and correct it."
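To make the idea of detecting and correcting one flipped bit out of 64 concrete, here is a minimal sketch in Python using a textbook Hamming code. This is an illustration only: the article does not describe the actual logic in the Xserve G5 memory subsystem, real ECC hardware does this work in circuitry rather than software, and the function names hamming_encode and hamming_correct are hypothetical.

```python
def hamming_encode(data, nbits=64):
    """Encode an nbits-wide integer as a list of bits with Hamming parity.

    The list is 1-indexed (index 0 is unused). Positions that are powers
    of two hold parity bits; all other positions hold data bits.
    """
    # Smallest r with 2**r >= nbits + r + 1 (7 parity bits for 64 data bits).
    r = 0
    while (1 << r) < nbits + r + 1:
        r += 1
    total = nbits + r
    code = [0] * (total + 1)

    # Place the data bits in the non-power-of-two positions.
    j = 0
    for pos in range(1, total + 1):
        if (pos & (pos - 1)) == 0:
            continue  # parity position, filled in below
        code[pos] = (data >> j) & 1
        j += 1

    # Choose parity bits so the XOR of the positions of all 1-bits is zero.
    syndrome = 0
    for pos in range(1, total + 1):
        if code[pos]:
            syndrome ^= pos
    for i in range(r):
        if syndrome & (1 << i):
            code[1 << i] = 1
    return code


def hamming_correct(code):
    """Return (data, error_position); error_position is 0 if no error was seen.

    Corrects a single flipped bit. Two or more flipped bits are beyond
    what this simple sketch can repair.
    """
    code = list(code)  # work on a copy
    syndrome = 0
    for pos in range(1, len(code)):
        if code[pos]:
            syndrome ^= pos
    # For a single-bit error, the syndrome equals the flipped position.
    if 0 < syndrome < len(code):
        code[syndrome] ^= 1

    # Reassemble the data bits from the non-parity positions.
    data = 0
    j = 0
    for pos in range(1, len(code)):
        if (pos & (pos - 1)) == 0:
            continue
        if code[pos]:
            data |= 1 << j
        j += 1
    return data, syndrome


# Usage: store a 64-bit word, flip one bit "in memory", then read it back.
word = 0x0123456789ABCDEF
stored = hamming_encode(word)
stored[37] ^= 1  # simulate a single bit error
recovered, error_position = hamming_correct(stored)
print(hex(recovered), error_position)
```

In this sketch the 64 data bits are stored alongside 7 parity bits. When a single bit flips, recomputing the parity yields the position of the bad bit, which is then flipped back before the data is returned.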
Asking New Questions
The new Xserve cluster means better science. "People are already doing bigger things," Dr. Ribbens says. "A bigger computation means a more accurate computation and a better simulation."
Perhaps more important, he says, the Xserve cluster means a big increase in power for the typical user, easily by a factor of 100 or more.
That allows scientists to ask different questions.
For example, Ribbens explains, scientists involved in simulations might previously have been able to simulate the physical properties of one material or alloy.
"Now, they can not only get a better answer, but they can think about simulating many different alloys or alloys with slightly different properties," he says.
Instead of answering the question "What's going on in this engine, or weather system, or molecule?", scientists can ask, "What should be going on? What's the best way to design this molecule? I'm going to try 50 different examples and see which one works best."
"You move from just analysis to optimization," Ribbens concludes. "Before, you might have been happy to do just a few runs in a month. Now, if you can do several runs a day, that really changes the way you think about the science."
