
Data Local Iterative Methods For The Efficient Solution of Partial Differential Equations


A cooperation between LSS and LRR.

Funded by DFG.

Publications/Theses

Copyright:

The copyrights of the following papers are held by the publishers. The attached PostScript or PDF files are preprints. Please treat this material in a way consistent with the "fair use" provisions of the appropriate copyright laws.

Bibtex-Support:

Download a BibTeX file of the DiME project publications and theses.

2007 and 2008:

  • M. Stürmer, H. Köstler, and U. Rüde. A fast full multigrid solver for applications in image processing. Numer. Linear Algebra Appl., 15:187–200, 2008.
  • Josef Weidendorfer and Carsten Trinitis. Off-loading Application controlled Data Prefetching in numerical Codes for Multi-Core Processors. Int. J. High Performance Computing and Networking, 4(1):22–28, 2008.
    pdf (preprint)
  • M. Stürmer, J. Treibig, and U. Rüde. Optimising a 3D Multigrid Algorithm for the IA-64 Architecture. International Journal of Computational Science and Engineering (IJCSE), 4(1):29–35, 2008.
  • Tobias Gradl and Ulrich Rüde. Massively Parallel Multilevel Finite Element Solvers on the Altix 4700. inSiDE, 5(2):24–29, 2007.
    link
  • C. Freundl, T. Gradl, U. Rüde, and B. Bergen. Petascale Computing: Algorithms and Applications, chapter Towards Petascale Multilevel Finite Element Solvers. Chapman & Hall/CRC, December 2007.
  • M. Stürmer, J. Götz, G. Richter, and U. Rüde. Blood Flow Simulation on the Cell Broadband Engine using the Lattice Boltzmann Method. Technical Report 07-9, Lehrstuhl für Informatik 10 (Systemsimulation), Friedrich-Alexander-Universität Erlangen-Nürnberg, September 2007.
    pdf
  • H. Köstler, M. Stürmer, C. Freundl, and U. Rüde. PDE based Video Compression in Real Time. Technical Report 07-11, Lehrstuhl für Informatik 10 (Systemsimulation), Friedrich-Alexander-Universität Erlangen-Nürnberg, August 2007.
    pdf
  • M. Stürmer, H. Köstler, and U. Rüde. A fast multigrid solver for applications in image processing. Technical Report 07-6, Lehrstuhl für Informatik 10 (Systemsimulation), Friedrich-Alexander-Universität Erlangen-Nürnberg, May 2007.
    pdf
  • C. C. Douglas, U. Rüde, J. Hu, and M. L. Bittencourt. A Guide to Designing Cache Aware Multigrid Algorithms. Technical Report 07-3, Lehrstuhl für Informatik 10 (Systemsimulation), Friedrich-Alexander-Universität Erlangen-Nürnberg, April 2007.
    pdf

2006:

  • B. Bergen, T. Gradl, F. Hülsemann, and U. Rüde. A Massively Parallel Multigrid Method for Finite Elements. Computing in Science and Engineering, 8(6):56–62, December 2006.
  • J. Habich. Improving computational efficiency of Lattice Boltzmann methods on complex geometries. Bachelor's Thesis, December 2006.
    pdf
  • G. Wellein, T. Zeiser, G. Hager, and S. Donath. On the single processor performance of simple lattice boltzmann kernels. Computers & Fluids, 35(8–9):910–919, November 2006. ISSN 0045-7930.
    link
  • J. Härdtlein, A. Linke, and C. Pflaum. Blocking Techniques with Fast Expression Templates. Technical Report 06-8, Lehrstuhl für Informatik 10 (Systemsimulation), Friedrich-Alexander-Universität Erlangen-Nürnberg, November 2006.
    pdf
  • Vlasia Anagnostopoulou. Exploiting multi-core processors for memory-bound numerical codes by using prefetching techniques. Lehrstuhl für Rechnertechnik und Rechnerorganisation/Parallelrechnerarchitektur (LRR), Informatik 10, Fakultät für Informatik, Technische Universität München, Germany, October 2006. Diplomarbeit.
    pdf
  • M. Stürmer, J. Treibig, and U. Rüde. Optimizing a 3D Multigrid Algorithm for the IA-64 Architecture. In M. Becker and H. Szczerbicka, editors, Simulationstechnique - 19th Symposium in Hannover, September 2006, volume 16 of Frontiers in Simulation, pages 271–276. ASIM, SCS Publishing House, September 2006.
    pdf
  • Josef Weidendorfer and Carsten Trinitis. Block Prefetching for Numerical Codes. In Proc. of the ASIM-06 Conf., Frontiers in Simulation. SCS, 2006.
    pdf
  • A. Nitsure, K. Iglberger, U. Rüde, C. Feichtinger, G. Wellein, and G. Hager. Optimization of Cache Oblivious Lattice Boltzmann Method in 2D and 3D. In M. Becker and H. Szczerbicka, editors, Simulationstechnique - 19th Symposium in Hannover, September 2006, volume 16 of Frontiers in Simulation, pages 265–270. ASIM, SCS Publishing House, September 2006.
    pdf
  • A. Nitsure. Implementation and optimization of a cache-oblivious Lattice Boltzmann algorithm. Master's thesis, Lehrstuhl für Informatik 10 (Systemsimulation), Friedrich-Alexander-Universität Erlangen-Nürnberg, August 2006.
    pdf
  • J. Götz. Numerical simulation of blood flow with lattice boltzmann methods. Master's thesis, Lehrstuhl für Informatik 10 (Systemsimulation), Friedrich-Alexander-Universität Erlangen-Nürnberg, July 2006.
    pdf
  • Josef Weidendorfer and Carsten Trinitis. Cache Optimizations for Iterative Numerical Codes Aware of Hardware Prefetching. volume 3732 of Lecture Notes in Computer Science, pages 921–927. Springer, 2006.
    pdf
  • Markus Stürmer. Optimierung von Mehrgitteralgorithmen auf der IA-64 Rechnerarchitektur. Diplomarbeit, Lehrstuhl für Informatik 10 (Systemsimulation), Institut für Informatik, University of Erlangen-Nuremberg, Germany, 2006.
    pdf
  • J. Götz. Simulation of bloodflow in aneurysms using the lattice boltzmann method and an adapted data structure. Technical Report 06-6, Lehrstuhl für Informatik 10 (Systemsimulation), Friedrich-Alexander-Universität Erlangen-Nürnberg, 2006.
    pdf

2005:

  • S. Donath, T. Zeiser, G. Hager, J. Habich, and G. Wellein. Optimizing Performance of the Lattice Boltzmann Method for Complex Structures on Cache-based Architectures. In F. Hülsemann, M. Kowarschik, and U. Rüde, editors, 18th Symposium Simulationstechnique ASIM 2005 Proceedings, volume 15 of Frontiers in Simulation, pages 728–735. ASIM, SCS Publishing House, September 2005.
  • J. Treibig, S. Hausmann, and U. Rüde. Performance Analysis of the Lattice Boltzmann Method on x-86-64 Architectures. In F. Hülsemann, M. Kowarschik, and U. Rüde, editors, 18th Symposium Simulationstechnique ASIM 2005 Proceedings, volume 15 of Frontiers in Simulation, pages 736–741. ASIM, SCS Publishing House, September 2005.
    pdf
  • B. Bergen. Hierarchical Hybrid Grids: Data Structures and Core Algorithms for Efficient Finite Element Simulations on Supercomputers. PhD thesis, FAU Erlangen, 2005.
  • Josef Weidendorfer and Carsten Trinitis. Collecting and Exploiting Cache-Reuse Metrics. In ICCS 2005: 5th International Conference on Computational Science, volume 3515 of LNCS, pages 191-198. Springer, May 2005.
    pdf
  • Stefan Lukowitz. Ermittlung des Quellcodebezugs von Speicherzugriffen auf Datenobjekte. Lehrstuhl für Rechnertechnik und Rechnerorganisation/Parallelrechnerarchitektur (LRR), Informatik 10, Fakultät für Informatik, Technische Universität München, Germany, May 2005. Diplomarbeit.
    pdf
  • B. Bergen, F. Hülsemann, and U. Rüde. Is 1.7×10^10 Unknowns the Largest Finite Element System that Can Be Solved Today? In SC '05: Proceedings of the 2005 ACM/IEEE conference on Supercomputing, Washington, DC, USA, 2005. IEEE Computer Society.
    link
  • Markus Stürmer. Optimierung des Red-Black-Gauss-Seidel-Verfahrens auf ausgewählten x86-Prozessoren. Studienarbeit, Lehrstuhl für Informatik 10 (Systemsimulation), Institut für Informatik, University of Erlangen-Nuremberg, Germany, 2005.
    pdf
  • Simon Hausmann. Optimization and Performance Analysis of the Lattice Boltzmann Method on x86-64 based Architectures. Bachelor Thesis, Lehrstuhl für Informatik 10 (Systemsimulation), Institut für Informatik, University of Erlangen-Nuremberg, Germany, 2005.
    pdf
  • G. Hager, T. Zeiser, J. Treibig, and G. Wellein. Optimizing Performance on Modern HPC Systems: Learning From Simple Kernel Benchmarks. Conference Paper, March 2005. 2nd Russian-German Advanced Research Workshop on Computational Science and High Performance Computing, German-Russian Center for Computational Technologies and High Performance Computing, Stuttgart, 16.03.2005.

2004:

  • T. Pohl, N. Thürey, F. Deserno, U. Rüde, P. Lammers, G. Wellein, and T. Zeiser. Performance Evaluation of Parallel Large-Scale Lattice Boltzmann Applications on Three Supercomputing Architectures. November 2004. Supercomputing Conference 04.
  • S. Donath. On Optimized Implementations of the Lattice Boltzmann Method on Contemporary Architectures. Bachelor's Thesis, August 2004.
    pdf
  • Dietrich Christopheit. Vervollständigung von Profiling-Messungen durch Kombination. Lehrstuhl für Rechnertechnik und Rechnerorganisation/Parallelrechnerarchitektur (LRR), Informatik 10, Fakultät für Informatik, Technische Universität München, Germany, August 2004. Diplomarbeit.
    pdf
  • Markus Kowarschik. Data Locality Optimizations for Iterative Numerical Algorithms and Cellular Automata on Hierarchical Memory Architectures. PhD thesis. July 2004, SCS Publishing House, Germany. ISBN 3-936150-39-7.
    ps.gz, pdf
  • Stefan Hamerl. Entwicklung eines komponentenbasierten Designs zur interaktiven Visualisierung und Steuerung von Profiling-Messungen. Lehrstuhl für Rechnertechnik und Rechnerorganisation/Parallelrechnerarchitektur (LRR), Informatik 10, Fakultät für Informatik, Technische Universität München, Germany, July 2004. Diplomarbeit.
    pdf
  • Manfred Hauser. Assemblerbasierte Optimierungen für EPIC- und CISC-Architekturen bei iterativen numerischen Codes. Lehrstuhl für Rechnertechnik und Rechnerorganisation/Parallelrechnerarchitektur (LRR), Informatik 10, Fakultät für Informatik, Technische Universität München, Germany, July 2004. Diplomarbeit.
    pdf
  • Josef Weidendorfer and Carsten Trinitis. Cache Optimization for Iterative Numerical Codes Aware of Hardware Prefetching. International Conference on Applied Parallel Computing (PARA 04), Copenhagen, Denmark, June 2004.
  • Markus Kowarschik, Iris Christadler, and Ulrich Rüde. Towards Cache-Optimized Multigrid Using Patch-Adaptive Relaxation. In Proceedings of the 2004 Conference on Applied Parallel Computing (PARA'04), Copenhagen, Denmark, June 2004. Lecture Notes in Computer Science (LNCS), Springer.
    ps.gz
  • Josef Weidendorfer, Markus Kowarschik, and Carsten Trinitis. A Tool Suite for Simulation Based Analysis of Memory Access Behavior. In Proceedings of the 2004 International Conference on Computational Science, Krakow, Poland, June 2004. Lecture Notes in Computer Science (LNCS), vol. 3038, Springer.
    pdf

2003:

  • Jan Treibig et al. Performance Analysis of the Lattice Boltzmann Method on x86-64 Architectures. In Proceedings of the ASIM-05 Conference, volume 2790 of Frontiers in Simulation, pages 441-450. SCS, 2003.
    pdf
  • Markus Kowarschik and Christian Weiß. An Overview of Cache Optimization Techniques and Cache-Aware Numerical Algorithms. Proceedings of the GI-Dagstuhl Forschungseminar: Algorithms for Memory Hierarchies, Lecture Notes in Computer Science (LNCS), Vol. 2625, Springer.
    ps.gz
  • Jens Wilke. Cache Optimizations for the Lattice Boltzmann Method in 2D. Studienarbeit, Lehrstuhl für Informatik 10 (Systemsimulation), Institut für Informatik, University of Erlangen-Nuremberg, Germany, 2003.
    ps.gz
  • Jens Wilke, Thomas Pohl, Markus Kowarschik, and Ulrich Rüde. Cache Performance Optimizations for Parallel Lattice Boltzmann Codes in 2D. Technical Report 03-3, Lehrstuhl für Informatik 10 (Systemsimulation), University of Erlangen-Nuremberg, Germany, 2003.
    pdf
  • Jens Wilke, Thomas Pohl, Markus Kowarschik, and Ulrich Rüde. Cache Performance Optimizations for Parallel Lattice Boltzmann Codes. Proceedings of the Euro-Par '03 Conference, Klagenfurt, Austria, August 2003. Lecture Notes in Computer Science (LNCS), vol. 2790, Springer. Joint work between our project and the FreeWiHR project.
    ps.gz
  • Klaus Iglberger. Cache Optimizations for the Lattice Boltzmann Method in 3D. Bachelor Thesis, Lehrstuhl für Informatik 10 (Systemsimulation), Institut für Informatik, University of Erlangen-Nuremberg, Germany, 2003.
    ps.gz
  • Thomas Pohl, Markus Kowarschik, Jens Wilke, Klaus Iglberger, and Ulrich Rüde. Optimization and Profiling of the Cache Performance of Parallel Lattice Boltzmann Codes in 2D and 3D. Technical Report 03-8, Lehrstuhl für Informatik 10 (Systemsimulation), University of Erlangen-Nuremberg, Germany, 2003.
    pdf
  • Thomas Pohl, Markus Kowarschik, Jens Wilke, Klaus Iglberger, and Ulrich Rüde. Optimization and Profiling of the Cache Performance of Parallel Lattice Boltzmann Codes. Joint work between our project and the FreeWiHR project. Parallel Processing Letters, Vol. 13, No. 4 (2003) pp. 549-560.
  • Iris Christadler. Patch-adaptive Relaxation als Glätter im Mehrgitterverfahren. Studienarbeit, Lehrstuhl für Informatik 10 (Systemsimulation), Institut für Informatik, University of Erlangen-Nuremberg, Germany, 2003.
    ps.gz

2002:

  • Christian Weiß, Hermann Hellwagner, Linda Stals, and Ulrich Rüde. Data Locality Optimizations to Improve The Efficiency of Multigrid Methods. Technical Report 02-1, Lehrstuhl für Informatik 10 (Systemsimulation), University of Erlangen-Nuremberg, Germany, 2002.
    Presented at the 14th Gamm-Seminar on Concepts of Numerical Software in Kiel, Germany, Jan 23-25, 1998.
    ps.gz
  • Nils Thürey. Cache Optimizations for Multigrid in 3D. Studienarbeit, Lehrstuhl für Informatik 10 (Systemsimulation), Institut für Informatik, University of Erlangen-Nuremberg, Germany, 2002.
    ps.gz
  • Markus Kowarschik, Christian Weiß, and Ulrich Rüde. Data Layout Optimizations for Variable Coefficient Multigrid. In Proceedings of the 2002 International Conference on Computational Science, Amsterdam, The Netherlands, April 2002. Lecture Notes in Computer Science (LNCS), vol. 2331, Springer.
    ps.gz
  • Markus Kowarschik, Ulrich Rüde, Nils Thürey, and Christian Weiß. Performance Optimization of 3D Multigrid on Hierarchical Memory Architectures. In Proceedings of the 2002 Conference on Applied Parallel Computing (PARA'02), Espoo, Finland, June 2002. Lecture Notes in Computer Science (LNCS), vol. 2367, Springer.
    ps.gz
  • Markus Kowarschik and Christian Weiß. Cache Performance Tuning of Numerically Intensive Codes. Technical Report 02-2, Lehrstuhl für Informatik 10 (Systemsimulation), University of Erlangen-Nuremberg, Germany, 2002.
    ps.gz

2001:

  • W. Karl, M. Kowarschik, U. Rüde, and C. Weiß. DiMEPACK: A Cache-Aware Multigrid Library: User Manual. Technical Report 01-1, Lehrstuhl für Informatik 10 (Systemsimulation), University of Erlangen-Nuremberg, Germany, 2001.
    ps.gz
  • Markus Kowarschik and Christian Weiß. DiMEPACK - A Cache-Optimized Multigrid Library. In H.R. Arabnia, editor, Proceedings of International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2001), volume I, Las Vegas, Nevada, USA, June, 2001. CSREA, CSREA Press.
    ps.gz
  • Christian Weiß. Data Locality Optimizations for Multigrid Methods on Structured Grids. PhD thesis, Lehrstuhl für Rechnertechnik und Rechnerorganisation, Institut für Informatik, Technische Universität München, Munich, Germany, December 2001.
    ps.gz
  • V. Daum. Runtime-Adaptive Techniques for Multigrid Methods. Studienarbeit, Department of Computer Science, University of Erlangen-Nuremberg, Germany, 2001.
    ps.gz

2000:

  • Craig C. Douglas, Jonathan Hu, Markus Kowarschik, Ulrich Rüde, and Christian Weiß. Cache Optimization for Structured and Unstructured Grid Multigrid. Electronic Transactions on Numerical Analysis (ETNA), 10:21-40, February 2000.
    ps.gz
  • Craig C. Douglas, Jonathan Hu, Wolfgang Karl, Markus Kowarschik, Ulrich Rüde, and Christian Weiß. Fixed and Adaptive Cache Aware Algorithms for Multigrid Methods. In Multigrid VI. Proceedings of the Sixth European Multigrid Conference held in Gent, Belgium, September 27-30, 1999, Lecture Notes in Computational Science and Engineering (LNCSE), Vol. 14, July 2000.
    ps.gz
  • Markus Kowarschik, Ulrich Rüde, Christian Weiß, and Wolfgang Karl. Cache-Aware Multigrid Methods for Solving Poisson's Equation in Two Dimensions. Computing, 64 (2000), pp. 381-399.
    ps.gz
  • Craig C. Douglas, Jonathan Hu, Mohamed Iskandarani, Markus Kowarschik, Ulrich Rüde, and Christian Weiß. Maximizing Cache Memory Usage for Multigrid Algorithms for Applications of Fluid Flow in Porous Media. In Z. Chen, R.E. Ewing, and Z.-C. Shi (editors), Numerical Treatment of Multiphase Flows and Transport in Porous Media. Proceedings of the International Workshop Held at Beijing, China, August 2-6, 1999, Lecture Notes in Physics, pp. 124ff. Springer, August 2000.
    ps.gz
  • Craig C. Douglas, Gundolf Haase, Jonathan Hu, Markus Kowarschik, Ulrich Rüde, and Christian Weiß. Portable Memory Hierarchy Techniques For PDE Solvers: Part I. Siam News, 33(5), June 2000.
    ps.gz
  • Craig C. Douglas, Gundolf Haase, Jonathan Hu, Markus Kowarschik, Ulrich Rüde, and Christian Weiß. Portable Memory Hierarchy Techniques For PDE Solvers: Part II. Siam News, 33(6), July 2000.
    ps.gz
  • H. Pfänder. Cache optimierte Mehrgitterverfahren mit variablen Koeffizienten auf strukturierten Gittern. Diplomarbeit, Department of Computer Science, University of Erlangen-Nuremberg, Germany, 2000.
    ps.gz
  • M. Zetlmeisl. Performance Optimization of Numerically Intensive Codes - A Case Study From Biomedical Engineering. Studienarbeit, Department of Computer Science, University of Erlangen-Nuremberg, Germany, 2000.

1999:

  • H. Wörndl-Aichriedler. Adaptive Mehrgitterverfahren in Raum und Zeit. Diplomarbeit, Department of Computer Science, University of Erlangen-Nuremberg, Germany, 1999.
  • Christian Weiß, Wolfgang Karl, Markus Kowarschik, Ulrich Rüde. Memory Characteristics of Iterative Methods. In Proceedings of the Supercomputing Conference, Portland, Oregon, November 1999.
    ps.gz

1998:

  • Ulrich Rüde. Technological Trends and their Impact on the Future of Supercomputing. In H.-J. Bungartz, F. Durst, and C. Zenger (editors), High Performance Scientific and Engineering Computing, Proceedings of the International FORTWIHR Conference on HPSEC, Lecture Notes in Computational Science and Engineering (LNCSE), Vol. 8, pages 459-471. Springer, March 1998.
    ps.gz

1997:

  • Ulrich Rüde. Iterative Algorithms on High Performance Architectures. In Proceedings of the EuroPar97 Conference, Lecture Notes in Computer Science (LNCS), pages 26-29. Springer, August 1997.
    ps.gz
  • Linda Stals, Ulrich Rüde. Techniques for Improving the Data Locality of Iterative Methods. Technical Report MRR97-038, School of Mathematical Sciences, Australian National University, October 1997.
    ps.gz
  • Linda Stals, Ulrich Rüde, Christian Weiß, and Hermann Hellwagner. Data Local Iterative Methods for the Efficient Solution of Partial Differential Equations. In Proceedings of the Eighth Biennial Computational Techniques and Applications Conference, Adelaide, Australia, September 1997.
    ps.gz

cs10-dime@fau.de
Last Modified: 15 July 2009