
Computational Methods in Systems and Control Theory


Non-Funded Research Activity

Multicore and (Multi-)GPU Computing




Project director: Researcher: Duration: since 01/2008

Project description:

Progress in modern desktop computer architectures has driven numerical analysts and scientific computing researchers to investigate shared-memory parallelization techniques for all kinds of numerical algorithms. Multicore CPUs have been widely addressed in basic linear algebra libraries such as BLAS and LAPACK. Threaded variants of these libraries have enabled many algorithms to exploit the multiple cores available in modern CPUs. Other algorithms, especially those addressing large and sparse problems, still require research.

In recent years, the evolution of graphics processors (GPUs) has extended their use to many scientific and engineering applications. In particular, their highly parallel architecture makes them suitable for vector operations and especially appealing for linear algebra (LA) applications. To leverage their potential, it is necessary to rethink numerical LA algorithms and codes. The impact of GPUs in LA has motivated the design of a number of recent LA libraries: CUBLAS, CUFFT, FLAME, MAGMA, ... Exploiting these libraries to generate high-performance codes on modern computers is another core topic of this research activity.




Related publications:

@String { CPE = {Concurrency and Comput.: Pract. Exper.} }
@Article{KoeS18,
author = {K\"{o}hler, Martin and Saak, Jens},
title = {Frequency Scaling and Energy Efficiency regarding the Gauss-Jordan Elimination Scheme with Application to the Matrix-Sign-Function on OpenPOWER 8},
journal = CPE,
year = 2018,
volume = 31,
number = 6,
month = apr,
doi = {10.1002/cpe.4504} }
Frequency Scaling and Energy Efficiency regarding the Gauss-Jordan Elimination Scheme with Application to the Matrix-Sign-Function on OpenPOWER 8;
Köhler, Martin; Saak, Jens;
in Ezzatti, Pablo; Quintana-Ortí, Enrique; Remon, Alfredo; Saak, Jens: Concurrency and Computation: Practice and Experience  :  vol. 31(6) Special Issue: Power‐Aware Computing (PACO2017);
2018.
@inproceedings{KoeS17,
author = {K{\"o}hler, Martin and Saak, Jens},
title = {{Frequency Scaling and Energy Efficiency regarding the Gauss-Jordan Elimination Scheme on OpenPower 8}},
booktitle = {2nd Workshop on Power-Aware Computing 2017},
year = 2017,
publisher = {Zenodo},
month = may,
doi = {10.5281/zenodo.574664},
url = {https://doi.org/10.5281/zenodo.574664} }
Frequency Scaling and Energy Efficiency regarding the Gauss-Jordan Elimination Scheme on OpenPower 8;
Köhler, Martin; Saak, Jens;
2nd Workshop on Power-Aware Computing 2017 (PACO 2017)  :  
Zenodo; 2017.
DOI: 10.5281/zenodo.574664.
@article{KoeS17,
author = {K{\"o}hler, Martin and Saak, Jens},
title = {A {GPU} Accelerated {Gauss-Jordan} Elimination on the {OpenPOWER} platform -- A case study},
journal = PAMM,
volume = {17},
number = {1},
publisher = WileyVCH,
issn = {1617-7061},
doi = {10.1002/pamm.201710390},
pages = {845--846},
year = {2017} }
A GPU Accelerated Gauss-Jordan Elimination on the OpenPOWER platform -- A case study;
Köhler, Martin; Saak, Jens;
Proc. Appl. Math. Mech.  :  
WILEY-VCH Verlag; 2017. ISBN/ISSN: 1617-7061
@incollection{ year={2014},
isbn={978-3-319-09152-5},
booktitle={Computational Science and Its Applications – ICCSA 2014},
volume={8584},
series={Lecture Notes in Computer Science},
editor={Murgante, Beniamino and Misra, Sanjay and Rocha, Ana Maria A. C. and Torre, Carmelo and Rocha, Jorge Gustavo and Falcão, Maria Irene and Taniar, David and Apduhan, Bernady O. and Gervasi, Osvaldo},
doi={10.1007/978-3-319-09153-2_29},
title={Accelerating Band Linear Algebra Operations on GPUs with Application in Model Reduction},
url={https://dx.doi.org/10.1007/978-3-319-09153-2_29},
publisher={Springer International Publishing},
keywords={Band linear systems; linear algebra; graphics processors (GPUs); high performance; control theory},
author={Benner, Peter and Dufrechou, Ernesto and Ezzatti, Pablo and Igounet, Pablo and Quintana-Ortí, Enrique S. and Remón, Alfredo},
pages={386--400},
language={English} }
Accelerating Band Linear Algebra Operations on GPUs with Application in Model Reduction;
Benner, Peter; Dufrechou, Ernesto; Ezzatti, Pablo; Igounet, Pablo; Quintana-Ortí, Enrique S.; Remón, Alfredo;
in Murgante, Beniamino; Misra, Sanjay; Rocha, Ana Maria A. C.; Torre, Carmelo; Rocha, Jorge Gustavo; Falcão, Maria Irene; Taniar, David; Apduhan, Bernady O.; Gervasi, Osvaldo: Computational Science and Its Applications – ICCSA 2014  :  Vol. 8584 of Lecture Notes in Computer Science;
Springer International Publishing; 2014. ISBN/ISSN: 978-3-319-09152-5
@article{BenEQetal14b,
author = {Benner, P. and Ezzatti, P. and Quintana{-}Ort{\'{\i}}, E.~S. and Rem{\'{o}}n, A.},
title = {Trading Off Performance for Energy in Linear Algebra Operations with Applications in Control Theory},
journal = {{CLEI Electronic Journal}},
volume = {17},
pages = {4--4},
year = {2014} }
Trading Off Performance for Energy in Linear Algebra Operations with Applications in Control Theory;
Benner, Peter; Ezzatti, Pablo; Quintana-Ortí, Enrique S.; Remón, Alfredo;
CLEI Electronic Journal  :  Vol. 17;
2014. ISBN/ISSN: 0717-5000
Solving Matrix Equations on Multi-Core and Many-Core Architectures;
Benner, Peter; Ezzatti, Pablo; Mena, Hermann; Quintana-Ortí, Enrique; Remón, Alfredo;
Algorithms  :  Vol. 6(4);
2013. ISBN/ISSN: 1999-4893
@incollection{BenEQetal13c,
author = {Benner, P. and Ezzatti, P. and Quintana{-}Ort{\'{\i}}, E.~S. and Rem{\'{o}}n, A.},
title = {Exploiting Data- and Task-Parallelism in the Solution of {R}iccati Equations on Multicore Servers and {GPUs}},
series = AdvParComp,
booktitle = {Parallel Computing: Accelerating Computational Science and Engineering (CSE), Proceedings of the International Conference on Parallel Computing, ParCo 2013, 10-13 September 2013, Garching, Germany},
volume = {25},
pages = {367--374},
isbn = {978-1-61499-380-3},
publisher = {{IOS} Press},
editor = {Michael Bader and Arndt Bode and Hans{-}Joachim Bungartz and Michael Gerndt and Gerhard R. Joubert and Frans J. Peters},
doi = {10.3233/978-1-61499-381-0-367},
year = {2013} }
Exploiting Data- and Task-Parallelism in the Solution of Riccati Equations on Multicore Servers and GPUs;
Benner, Peter; Ezzatti, Pablo; Quintana-Ortí, Enrique S.; Remón, Alfredo;
in Bader, Michael; Bode, Arndt; Bungartz, Hans-Joachim; Gerndt, Michael; Joubert, Gerhard R.; Peters, Frans J.: Proceedings of the International Conference on Parallel Computing, 2013  :  Vol. 25 Parallel Computing: Accelerating Computational Science and Engineering (CSE);
IOS Press; 2013. ISBN/ISSN: 978-1-61499-380-3
On the Impact of Optimization on the Time-Power-Energy Balance of Dense Linear Algebra Factorizations;
Benner, Peter; Ezzatti, Pablo; Quintana-Ortí, Enrique; Remón, Alfredo;
in Aversa, Rocco; Kołodziej, Joanna; Zhang, Jun; Amato, Flora; Fortino, Giancarlo: Algorithms and Architectures for Parallel Processing  :  Lecture Notes in Computer Science Vol. 8286;
Springer International Publishing; 2013. ISBN/ISSN: 978-3-319-03888-9
Unleashing CPU-GPU Acceleration for Control Theory;
Benner, Peter; Ezzatti, Pablo; Quintana-Ortí, Enrique S.; Remón, Alfredo;
Euro-Par 2012: Parallel Processing Workshops  :  Vol. 7640;
Springer Berlin Heidelberg; 2013. ISBN/ISSN: 978-3-642-36949-0
Accelerating the Lyapack library using GPUs;
Dufrechou, Ernesto; Ezzatti, Pablo; Quintana-Ortí, Enrique; Remón, Alfredo;
The Journal of Supercomputing  :  Vol. 63 (3);
Springer US; 2013. ISBN/ISSN: 0920-8542
@TECHREPORT{MPIMD12-02,
author = {Peter Benner and Pablo Ezzatti and Enrique S. Quintana-Ortí and Alfredo Remón},
title = {Matrix Inversion on CPU-GPU Platforms with Applications in Control Theory},
institution = {Max Planck Institute Magdeburg Preprints},
year = 2012,
number = {MPIMD/12-02},
month = {February} }
Matrix Inversion on CPU-GPU Platforms with Applications in Control Theory;
Benner, Peter; Ezzatti, Pablo; Quintana-Ortí, Enrique S.; Remón, Alfredo;
Concurrency and Computation: Practice and Experience  :  
John Wiley & Sons, Ltd; 2013. pp. 1170–1182
Accelerating Model Reduction of Large Linear Systems with Graphics Processors;
Benner, Peter; Ezzatti, Pablo; Kressner, Daniel; Quintana-Ortí, Enrique S.; Remón, Alfredo;
in Kristján Jónasson: Applied Parallel and Scientific Computing  :  Vol. 7134;
Springer Berlin Heidelberg; 2012. ISBN/ISSN: 978-3-642-28144-0
High Performance Matrix Inversion on a Multi-core Platform with several GPUs;
Ezzatti, Pablo; Quintana-Ortí, Enrique S.; Remón, Alfredo;
in Cotronis, Yiannis; Danelutto, Marco; Papadopoulos, George A.: Proceedings of the 19th International Euromicro Conference on Parallel, Distributed and Network-based Processing, PDP 2011  :  Vol. ?? of;
IEEE Computer Society; 2011. ISBN/ISSN: 978-0-7695-4328-4
Numerical Solution of Differential Riccati Equations on Hybrid CPU-GPU Platforms;
Benner, Peter; Ezzatti, Pablo; Mena, Hermann; Quintana-Ortí, Enrique S.; Remón, Alfredo;
Proceedings of ALAMA2010 - 2nd Meeting on Linear Algebra, Matrix Analysis, and Applications, Valencia, June 2nd-4th  :  7 pages;
2011. ISBN/ISSN: 978-84-8363-544-5
Solving Differential Riccati Equations on Multi-GPU Platforms;
Benner, Peter; Ezzatti, Pablo; Mena, Hermann; Quintana-Ortí, Enrique S.; Remón, Alfredo;
Proceedings of the 11th International Conference on Computational and Mathematical Methods in Science and Engineering, CMMSE 2011  :  Vol. 1 pp. 178-188;
2011. ISBN/ISSN: 978-84-614-6167-7
Using Hybrid CPU-GPU Platforms to Accelerate the Computation of the Matrix Sign Function;
Benner, Peter; Ezzatti, Pablo; Quintana-Ortí, Enrique S.; Remón, Alfredo;
in Lin, Hai-Xiang; Alexander, Michael; Forsell, Martti; Knüpfer, Andreas; Prodan, Radu; Sousa, Leonel; Streit, Achim: Euro-Par 2009 - Parallel Processing Workshops  :  Lecture Notes in Computer Science Vol. 6043;
Springer; 2010. ISBN/ISSN: 978-3-642-14121-8
@TECHREPORT{KoeS09,
author = {Martin K\"ohler and Jens Saak},
title = {Efficiency improving implementation techniques for large scale matrix equation solvers},
institution = {TU Chemnitz},
year = 2009,
type = {{C}hemnitz {S}cientific {C}omputing {P}rep.},
number = {CSC 09-10},
type = {Preprint} }
Efficiency improving implementation techniques for large scale matrix equation solvers;
Köhler, Martin; Saak, Jens;
CSC Preprint 09-10;
2009.
An Algorithm-by-Blocks for SuperMatrix Band Cholesky Factorization;
Quintana-Ortí, Gregorio; Quintana-Ortí, Enrique S.; Remón, Alfredo; van de Geijn, Robert A.;
in Laginha, J.M.; Palma, M.; Amestoy, P.; Daydé, M.J; Mattoso, M; Correia, J.: High Performance Computing for Computational Science - VECPAR 2008, 8th International Conference  :  Lecture Notes in Computer Science Vol. 5336;
Springer; 2008. ISBN/ISSN: 978-3-540-92858-4
Parallel Solution of Band Linear Systems in Model Reduction;
Remón, Alfredo; Quintana-Ortí, Enrique S.; Quintana-Ortí, Gregorio;
in Wyrzykowski, Roman; Dongarra, Jack; Karczewski, Konrad; Wasniewski, Jerzy: Parallel Processing and Applied Mathematics, 7th International Conference  :  Lecture Notes in Computer Science Vol. 4967;
Springer; 2008. ISBN/ISSN: 978-3-540-68105-2
The Implementation of BLAS for Band Matrices;
Remón, Alfredo; Quintana-Ortí, Enrique S.; Quintana-Ortí, Gregorio;
in Wyrzykowski, Roman; Dongarra, Jack; Karczewski, Konrad; Wasniewski, Jerzy: Parallel Processing and Applied Mathematics, 7th International Conference  :  Lecture Notes in Computer Science Vol. 4967;
Springer; 2008. ISBN/ISSN: 978-3-540-68105-2

Related talks:





©2024, Max Planck Society, Munich
Jens Saak, Martin Köhler, saak, koehlerm@mpi-magdeburg.mpg.de
31 January 2012