COLSA Corporation
Taking Apple Xserve to MACH5
Left to right: Senior Scientist Dr. John Medeiros, Program Manager Mike Whitlock and Project Engineer Jeff Highfield
A typical problem Medeiros and colleagues face involves computational fluid dynamics (CFD) to model flight for either missile bodies or flow-through scramjet engines.
"Depending on how much computational power you have, you define the space around the object of interest with anywhere from two to 20 million grid points at which you want to solve the physics of the problem," he says. "Then you group the set of points around the object and assign a group to each processor on a multiprocessor machine."
When the calculations run, each processor incrementally solves the relevant equations and compares the results with its nearest neighbors to smooth out the calculations. A typical problem solution requires many thousands of such increments or iterations.
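The partition-and-iterate scheme described above can be sketched in a few lines of Python. This is only an illustrative toy, not COLSA's production code: it assumes a one-dimensional grid, a simple Jacobi averaging rule in place of the real flow equations, and simulated "processors" that run one after another in a single process. The function names (`partition`, `jacobi_step`, `solve`) and all parameter values are hypothetical.

```python
# Toy sketch (assumed, not the article's actual solver): a 1-D grid is split
# into contiguous blocks, one per simulated "processor", and every point is
# repeatedly replaced by the average of its two nearest neighbors.

def partition(n_points, n_procs):
    """Split indices 0..n_points-1 into n_procs contiguous (start, stop) blocks."""
    base, extra = divmod(n_points, n_procs)
    blocks, start = [], 0
    for p in range(n_procs):
        size = base + (1 if p < extra else 0)
        blocks.append((start, start + size))
        start += size
    return blocks

def jacobi_step(values, blocks):
    """One iteration: each block updates its points from the previous iterate.

    Reading values[i - 1] or values[i + 1] at a block edge stands in for the
    exchange of boundary data with the neighboring processor.
    """
    new = values[:]
    last = len(values) - 1
    for start, stop in blocks:
        # The two end points of the whole grid are fixed boundary values.
        for i in range(max(start, 1), min(stop, last)):
            new[i] = 0.5 * (values[i - 1] + values[i + 1])
    return new

def solve(n_points=16, n_procs=4, tol=1e-6, max_iters=100_000):
    """Iterate until the largest per-point change falls below tol."""
    values = [0.0] * n_points
    values[-1] = 1.0  # boundary conditions: 0 on the left, 1 on the right
    blocks = partition(n_points, n_procs)
    for iteration in range(1, max_iters + 1):
        new = jacobi_step(values, blocks)
        change = max(abs(a - b) for a, b in zip(new, values))
        values = new
        if change < tol:
            return values, iteration
    return values, max_iters

if __name__ == "__main__":
    result, iterations = solve()
    print(f"converged after {iterations} iterations")
    print([round(v, 3) for v in result])
```

Even this tiny 16-point problem needs many iterations to settle, which hints at why a grid of millions of points, with far more complex physics at each one, can occupy a cluster for days or months.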
Medeiros elaborates: "Depending on the grid points you define, the number of iterations you go through and how many time steps you want to pass through, a problem might take anywhere from several days to several months of 24/7 runtime."
"Then, if you get anomalous results, the entire calculation may have to be redone or recast in a slightly different way, which can take another three months.
"That's the kind of thing we're trying to circumvent with the computational power the Apple supercluster gives us."
Exploring Clusters
When Medeiros and COLSA program manager Mike Whitlock began investigating options, they saw clusters as a way to acquire the computational power they needed at a reasonable cost. "We found that, for our particular code, the cluster is more suited than the hardware we were running," Medeiros says.
"Our particular application requires lots of floating-point performance and has minimal requirements for inter-processor switching capabilities. A lot of national labs spend a good chunk of money on the high-end switch for their clusters. Depending on the kind of switch you use, easily half the cost of the cluster can involve the switch fabric alone.
"We wanted to put our resources into the best floating-point performance. About 90 percent of the cost went into the computational hardware and about 10 percent into the switch fabric."
Benchmark Testing
Medeiros had investigated clusters earlier; in fact, he had experimented with one of the first Apple clusters, using 17 Power Mac G4 systems, when clusters weren't yet fashionable.
But when they looked more closely at different cluster architectures for the new system, Medeiros and project engineer Jeff Highfield tested everything that was out there, including the AMD Athlon, Intel Xeon, AMD Opteron, Intel Itanium 2 and PowerPC G5.
Using the primary application normally used in production, Medeiros and Highfield did benchmark testing on eight systems. The benchmark test problem involved simplified geometry with two million grid points and aero-thermodynamics of 12 chemical species in the atmosphere and engine combustion products.
