Volume 10, pp. 21-40, 2000.
Cache optimization for structured and unstructured grid multigrid
Craig C. Douglas, Jonathan Hu, Markus Kowarschik, Ulrich Rüde, Christian Weiss
Abstract
Many current computer designs employ caches and a hierarchical memory architecture. The speed of a code depends on how well the cache structure is exploited, so the number of cache misses provides a better measure for comparing algorithms than the number of multiplies. In this paper, suitable blocking strategies for both structured and unstructured grids are introduced. They improve cache usage without changing the underlying algorithm; in particular, bitwise compatibility is guaranteed between the standard and the high performance implementations of the algorithms. This is illustrated by comparisons for various multigrid algorithms on a selection of different computers for problems in two and three dimensions. The code restructuring can yield performance improvements by a factor of 2 to 5, which allows the modified codes to achieve a much higher percentage of the peak performance of the CPU than is usually observed with standard implementations.
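As a rough illustration of what a reordering that keeps results bitwise identical can look like, the sketch below fuses the red and black half-sweeps of a red-black Gauss-Seidel smoother on a structured 2D grid, so that each row is revisited while it is still likely to be in cache. It is not taken from the paper: the 5-point Poisson stencil, the grid layout, and all names (`relax_row`, `sweep_standard`, `sweep_fused`, `make_problem`) are assumptions made for the example, and plain Python shows only the reordering, not the speedup.

```python
import copy


def relax_row(u, f, h2, i, color):
    """Gauss-Seidel update of the points of one color in interior row i.

    A point (i, j) is "red" when (i + j) is even (color 0) and "black"
    when (i + j) is odd (color 1). Assumed 5-point Poisson stencil.
    """
    n = len(u) - 2
    # First interior column j >= 1 with (i + j) % 2 == color.
    j0 = 1 if (i + 1) % 2 == color else 2
    for j in range(j0, n + 1, 2):
        u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                          + u[i][j - 1] + u[i][j + 1] + h2 * f[i][j])


def sweep_standard(u, f, h2):
    """Two full passes over the grid: all red points, then all black points."""
    n = len(u) - 2
    for color in (0, 1):
        for i in range(1, n + 1):
            relax_row(u, f, h2, i, color)


def sweep_fused(u, f, h2):
    """One pass: after the red points of row i, update the black points of row i - 1.

    A black point in row i - 1 only reads red points in rows i - 2, i - 1
    and i, all of which already hold their new values, so every update sees
    exactly the same operands as in sweep_standard.
    """
    n = len(u) - 2
    for i in range(1, n + 2):
        if i <= n:
            relax_row(u, f, h2, i, 0)      # red points of row i
        if i >= 2:
            relax_row(u, f, h2, i - 1, 1)  # black points of row i - 1


def make_problem(n):
    """Deterministic test data on an (n + 2) x (n + 2) grid including the boundary."""
    u = [[((3 * i + 7 * j) % 11) / 11.0 for j in range(n + 2)]
         for i in range(n + 2)]
    f = [[((5 * i + 2 * j) % 13) / 13.0 for j in range(n + 2)]
         for i in range(n + 2)]
    h2 = 1.0 / (n + 1) ** 2
    return u, f, h2


if __name__ == "__main__":
    u0, f, h2 = make_problem(16)
    a, b = copy.deepcopy(u0), copy.deepcopy(u0)
    sweep_standard(a, f, h2)
    sweep_fused(b, f, h2)
    print("bitwise identical:", a == b)
```

Because both variants evaluate the same expression on the same operands in the same order for every grid point, the comparison is exact floating-point equality rather than agreement within a tolerance.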
Key words
computer architectures, iterative algorithms, multigrid, high performance computing, cache.
AMS subject classifications
65M55, 65N55, 65F10, 68-04, 65Y99.
ETNA articles which cite this article
Vol. 15 (2003), pp. 66-77, Malik Silva: Cache aware data laying for the Gauss-Seidel smoother