Advanced Computation Group

The Advanced Computation Group (ACG) researches algorithms and high-performance issues relevant to Apple technology. ACG provides internal algorithm services for Apple and also supports Apple users in the science, education, and engineering sectors. Specifically, the activities of the ACG include:

Technical Papers

The following are recent papers by Apple’s Advanced Computation Group. For older papers, particularly those dealing with the PowerPC G4/G5 architectures, please visit the archive.

Large-scale FFTs and convolutions on Apple hardware

Abstract: Impressive FFT performance for large signal lengths can be achieved via a matrix paradigm that exploits the modern concepts of cache, memory, and multicore/multithreading. Each of the large-scale FFT implementations we report herein is built hierarchically on very fast FFTs from the standard Mac OS X Accelerate library. (The hierarchical ideas should apply equally well for low-level FFTs of, say, the OpenCL/GPU variety.) By building on such established, packaged, small-length FFTs, one can achieve on a single Apple machine, even for signal lengths as large as 2^30 to 2^32, sustained processing rates as high as 20 gigaflop/s in double precision and 40 gigaflop/s in single precision.
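The matrix paradigm mentioned in the abstract can be illustrated with the classical "four-step" factorization, in which a length-N transform (N = rows × cols) is assembled from many short FFTs plus a twiddle-factor multiplication. The sketch below is a minimal pure-Python illustration, not the paper's implementation: `small_fft` is a hypothetical stand-in for a fast library FFT such as those in Accelerate, the names `four_step_fft` and `naive_dft` are illustrative, and both `rows` and `cols` are assumed to be powers of two.

```python
import cmath


def small_fft(x):
    """Recursive radix-2 FFT; a stand-in for a fast packaged small-length FFT.

    Assumes len(x) is a power of two.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = small_fft(x[0::2])
    odd = small_fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def four_step_fft(x, rows, cols):
    """Length rows*cols FFT built from short FFTs (four-step factorization).

    The input is viewed as a rows-by-cols matrix stored row-major.
    """
    n = rows * cols
    if len(x) != n:
        raise ValueError("len(x) must equal rows * cols")
    # Step 1: a length-`rows` FFT down each column of the matrix.
    columns = [small_fft(x[c::cols]) for c in range(cols)]
    # Step 2: multiply element (k1, c) by the twiddle factor exp(-2*pi*i*c*k1/n).
    for c in range(cols):
        for k1 in range(rows):
            columns[c][k1] *= cmath.exp(-2j * cmath.pi * c * k1 / n)
    # Step 3: a length-`cols` FFT along each row.
    out = [0j] * n
    for k1 in range(rows):
        row = small_fft([columns[c][k1] for c in range(cols)])
        # Step 4: transposed write-back, output index k = k1 + rows * k2.
        for k2 in range(cols):
            out[k1 + rows * k2] = row[k2]
    return out


def naive_dft(x):
    """Direct O(n^2) DFT, used only as a correctness reference."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
```

Calling `four_step_fft(signal, 8, 8)` on a 64-element list should agree with `naive_dft(signal)` to within rounding error. In a high-performance setting the appeal of this layout is that each short FFT can fit in cache and the independent rows or columns can be spread across threads; how the paper actually arranges memory and threading is not specified here.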

Multiprecision Floating-point Arithmetic on Apple Systems

Abstract: This paper describes the use of two software packages that facilitate arbitrary-precision floating-point arithmetic (thousands or even millions of digits) on Apple Macintosh computers. Both packages are available on the Internet free of charge, though their use in commercial applications is limited by license restrictions. Both packages support PowerPC and Intel Core CPUs. Configuration and installation of each package are discussed, relative strengths and tradeoffs associated with the two packages are described, and simple example code is given to illustrate common use of the packages. (February 2007)
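The abstract does not name the two packages, so the following is only a generic illustration of what arbitrary-precision floating-point arithmetic looks like in practice. It uses Python's standard `decimal` module as a stand-in, which is an assumption made for illustration and not one of the packages the paper discusses; the helper name `sqrt2_digits` is hypothetical.

```python
from decimal import Decimal, localcontext


def sqrt2_digits(digits):
    """Return sqrt(2) as a string truncated to `digits` decimal places.

    Works at a temporarily raised precision with a few guard digits so the
    truncated digits are not affected by rounding of the final place.
    """
    with localcontext() as ctx:
        ctx.prec = digits + 5  # significant digits, including guard digits
        value = Decimal(2).sqrt()
    # "1." plus the requested number of decimal places
    return str(value)[: digits + 2]
```

For instance, `sqrt2_digits(1000)` yields a 1002-character string, far beyond the roughly 16 significant digits of hardware double precision. Dedicated multiprecision libraries are typically much faster than this at very large digit counts.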

Gigaelement FFTs on x86-based Apple clusters

Abstract: This paper is an update to previous work on the use of computational nodes for very large Fast Fourier Transforms (FFTs). Test code has been updated for Intel-based hardware, again using the OS X Accelerate framework for all component FFTs, while new performance measurements are provided for a canonical x86 hardware configuration. These more modern tests exhibit considerable speed advantages. As a performance example, one can sustain more than 2 gigaflop/s real-time for double-precision gigaelement (length-2^30 complex) FFTs on a 4-machine cluster.
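To give a feel for what such a sustained rate means in wall-clock terms, the sketch below converts a flop rate into a transform time. It assumes the common benchmarking convention of counting 5 N log2 N floating-point operations for a complex FFT of length N (the convention used, for example, in FFTW's benchmark reports); whether the paper uses exactly this count is an assumption, and the function name `fft_seconds` is illustrative.

```python
def fft_seconds(log2_n, gigaflops):
    """Estimate seconds for one complex FFT of length 2**log2_n.

    Assumes the nominal operation count 5 * N * log2(N); real FFT codes
    perform somewhat fewer operations, so this is a convention, not a
    measurement.
    """
    n = 2 ** log2_n
    nominal_flops = 5 * n * log2_n
    return nominal_flops / (gigaflops * 1e9)
```

Under this convention, `fft_seconds(30, 2.0)` comes to roughly 80 seconds for a single gigaelement (2^30-point) complex transform at 2 gigaflop/s.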