High Performance Computing
Edited by Jeannot, Emmanuel; Zilinskas, Julius
- Hardcover
Other customers were also interested in
- Jean-Marc Pierson: Large-Scale Distributed Systems and Energy Efficiency (129,99 €)
- Samee U. Khan: Scalable Computing and Communications (186,99 €)
- Fayez Gebali: Algorithms and Parallel Computing (144,99 €)
- Amol B. Bakshi: Architecture-Independent Programming for Wireless Sensor Networks (135,99 €)
- Anthony J. G. Hey: Grid Computing (186,99 €)
- Dan C. Marinescu: Internet Workflow Management (186,99 €)
- Neeraj Kumar Goyal: Interconnection Network Reliability Evaluation (195,99 €)
With recent changes in multicore and general-purpose computing on graphics processing units, the way parallel computers are used and programmed has changed drastically. This calls for a comprehensive study, written by specialists in the domain, of how to use such machines. The book provides recent research results in high-performance computing on complex environments, information on how to efficiently exploit heterogeneous and hierarchical architectures and distributed systems, detailed studies on the impact of applying heterogeneous computing practices to real problems, and applications varying from remote sensing to tomography. The content spans topics such as Numerical Analysis for Heterogeneous and Multicore Systems; Optimization of Communication for High Performance Heterogeneous and Hierarchical Platforms; Efficient Exploitation of Heterogeneous Architectures, Hybrid CPU+GPU, and Distributed Systems; Energy Awareness in High-Performance Computing; and Applications of Heterogeneous High-Performance Computing.
- Covers cutting-edge research in HPC on complex environments, following an international collaboration of members of the ComplexHPC
- Explains how to efficiently exploit heterogeneous and hierarchical architectures and distributed systems
- Twenty-three chapters and over 100 illustrations cover domains such as numerical analysis, communication and storage, applications, GPUs and accelerators, and energy efficiency
Product details
- Wiley Series on Parallel and Distributed Computing
- Publisher: Wiley & Sons
- 1st edition
- Pages: 512
- Publication date: 3 June 2014
- Language: English
- Dimensions: 235mm x 157mm x 32mm
- Weight: 842g
- ISBN-13: 9781118712054
- ISBN-10: 1118712056
- Item no.: 39757412
EMMANUEL JEANNOT is a Senior Research Scientist at INRIA. He received his PhD in computer science from École Normale Supérieure de Lyon. His main research interests are process placement, scheduling for heterogeneous environments and grids, data redistribution, and algorithms and models for parallel machines.
JULIUS ZILINSKAS is a Principal Researcher and Head of Department at Vilnius University, Lithuania. His research interests include parallel computing, optimization, data analysis, and visualization.
Contributors
Preface

PART I INTRODUCTION
1. Summary of the Open European Network for High-Performance Computing in Complex Environments
Emmanuel Jeannot and Julius Zilinskas
- 1.1 Introduction and Vision
- 1.2 Scientific Organization
- 1.3 Activities of the Project
- 1.4 Main Outcomes of the Action
- 1.5 Contents of the Book

PART II NUMERICAL ANALYSIS FOR HETEROGENEOUS AND MULTICORE SYSTEMS
2. On the Impact of the Heterogeneous Multicore and Many-Core Platforms on Iterative Solution Methods and Preconditioning Techniques
Dimitar Lukarski and Maya Neytcheva
- 2.1 Introduction
- 2.2 General Description of Iterative Methods and Preconditioning
- 2.3 Preconditioning Techniques
- 2.4 Defect-Correction Technique
- 2.5 Multigrid Method
- 2.6 Parallelization of Iterative Methods
- 2.7 Heterogeneous Systems
- 2.8 Maintenance and Portability
- 2.9 Conclusion
3. Efficient Numerical Solution of 2D Diffusion Equation on Multicore Computers
Matjaz Depolli, Gregor Kosec, and Roman Trobec
- 3.1 Introduction
- 3.2 Test Case
- 3.3 Parallel Implementation
- 3.4 Results
- 3.5 Discussion
- 3.6 Conclusion
4. Parallel Algorithms for Parabolic Problems on Graphs in Neuroscience
Natalija Tumanova and Raimondas Ciegis
- 4.1 Introduction
- 4.2 Formulation of the Discrete Model
- 4.3 Parallel Algorithms
- 4.4 Computational Results
- 4.5 Conclusions

PART III COMMUNICATION AND STORAGE CONSIDERATIONS IN HIGH-PERFORMANCE COMPUTING
5. An Overview of Topology Mapping Algorithms and Techniques in High-Performance Computing
Torsten Hoefler, Emmanuel Jeannot, and Guillaume Mercier
- 5.1 Introduction
- 5.2 General Overview
- 5.3 Formalization of the Problem
- 5.4 Algorithmic Strategies for Topology Mapping
- 5.5 Mapping Enforcement Techniques
- 5.6 Survey of Solutions
- 5.7 Conclusion and Open Problems
6. Optimization of Collective Communication for Heterogeneous HPC Platforms
Kiril Dichev and Alexey Lastovetsky
- 6.1 Introduction
- 6.2 Overview of Optimized Collectives and Topology-Aware Collectives
- 6.3 Optimizations of Collectives on Homogeneous Clusters
- 6.4 Heterogeneous Networks
- 6.5 Topology- and Performance-Aware Collectives
- 6.6 Topology as Input
- 6.7 Performance as Input
- 6.8 Non-MPI Collective Algorithms for Heterogeneous Networks
- 6.9 Conclusion
7. Effective Data Access Patterns on Massively Parallel Processors
Gabriele Capannini, Ranieri Baraglia, Fabrizio Silvestri, and Franco Maria Nardini
- 7.1 Introduction
- 7.2 Architectural Details
- 7.3 K-Model
- 7.4 Parallel Prefix Sum
- 7.5 Bitonic Sorting Networks
- 7.6 Final Remarks
8. Scalable Storage I/O Software for Blue Gene Architectures
Florin Isaila, Javier Garcia, and Jesús Carretero
- 8.1 Introduction
- 8.2 Blue Gene System Overview
- 8.3 Design and Implementation
- 8.4 Conclusions and Future Work

PART IV EFFICIENT EXPLOITATION OF HETEROGENEOUS ARCHITECTURES
9. Fair Resource Sharing for Dynamic Scheduling of Workflows on Heterogeneous Systems
Hamid Arabnejad, Jorge G. Barbosa, and Frédéric Suter
- 9.1 Introduction
- 9.2 Concurrent Workflow Scheduling
- 9.3 Experimental Results and Discussion
- 9.4 Conclusions
10. Systematic Mapping of Reed-Solomon Erasure Codes on Heterogeneous Multicore Architectures
Roman Wyrzykowski, Marcin Wozniak, and Lukasz Kuczynski
- 10.1 Introduction
- 10.2 Related Works
- 10.3 Reed-Solomon Codes and Linear Algebra Algorithms
- 10.4 Mapping Reed-Solomon Codes on Cell/B.E. Architecture
- 10.5 Mapping Reed-Solomon Codes on Multicore GPU Architectures
- 10.6 Methods of Increasing the Algorithm Performance on GPUs
- 10.7 GPU Performance Evaluation
- 10.8 Conclusions and Future Works
11. Heterogeneous Parallel Computing Platforms and Tools for Compute-Intensive Algorithms: A Case Study
Daniele D'Agostino, Andrea Clematis, and Emanuele Danovaro
- 11.1 Introduction
- 11.2 A Low-Cost Heterogeneous Computing Environment
- 11.3 First Case Study: The N-Body Problem
- 11.4 Second Case Study: The Convolution Algorithm
- 11.5 Conclusions
12. Efficient Application of Hybrid Parallelism in Electromagnetism Problems
Alejandro Alvarez-Melcon, Fernando D. Quesada, Domingo Gimenez, Carlos Pérez-Alcaraz, Jose-Gines Picon, and Tomas Ramírez
- 12.1 Introduction
- 12.2 Computation of Green's Functions in Hybrid Systems
- 12.3 Parallelization in NUMA Systems of a Volume Integral Equation Technique
- 12.4 Autotuning Parallel Codes
- 12.5 Conclusions and Future Research

PART V CPU + GPU COPROCESSING
13. Design and Optimization of Scientific Applications for Highly Heterogeneous and Hierarchical HPC Platforms Using Functional Computation Performance Models
David Clarke, Aleksandar Ilic, Alexey Lastovetsky, Vladimir Rychkov, Leonel Sousa, and Ziming Zhong
- 13.1 Introduction
- 13.2 Related Work
- 13.3 Data Partitioning Based on Functional Performance Model
- 13.4 Example Application: Heterogeneous Parallel Matrix Multiplication
- 13.5 Performance Measurement on CPUs/GPUs System
- 13.6 Functional Performance Models of Multiple Cores and GPUs
- 13.7 FPM-Based Data Partitioning on CPUs/GPUs System
- 13.8 Efficient Building of Functional Performance Models
- 13.9 FPM-Based Data Partitioning on Hierarchical Platforms
- 13.10 Conclusion
14. Efficient Multilevel Load Balancing on Heterogeneous CPU + GPU Systems
Aleksandar Ilic and Leonel Sousa
- 14.1 Introduction: Heterogeneous CPU + GPU Systems
- 14.2 Background and Related Work
- 14.3 Load Balancing Algorithms for Heterogeneous CPU + GPU Systems
- 14.4 Experimental Results
- 14.5 Conclusions
15. The All-Pair Shortest-Path Problem in Shared-Memory Heterogeneous Systems
Hector Ortega-Arranz, Yuri Torres, Diego R. Llanos, and Arturo Gonzalez-Escribano
- 15.1 Introduction
- 15.2 Algorithmic Overview
- 15.3 CUDA Overview
- 15.4 Heterogeneous Systems and Load Balancing
- 15.5 Parallel Solutions to the APSP
- 15.6 Experimental Setup
- 15.7 Experimental Results
- 15.8 Conclusions

PART VI EFFICIENT EXPLOITATION OF DISTRIBUTED SYSTEMS
16. Resource Management for HPC on the Cloud
Marc E. Frincu and Dana Petcu
- 16.1 Introduction
- 16.2 On the Type of Applications for HPC and HPC2
- 16.3 HPC on the Cloud
- 16.4 Scheduling Algorithms for HPC2
- 16.5 Toward an Autonomous Scheduling Framework
- 16.6 Conclusions
17. Resource Discovery in Large-Scale Grid Systems
Konstantinos Karaoglanoglou and Helen Karatza
- 17.1 Introduction and Background
- 17.2 The Semantic Communities Approach
- 17.3 The P2P Approach
- 17.4 The Grid-Routing Transferring Approach
- 17.5 Conclusions

PART VII ENERGY AWARENESS IN HIGH-PERFORMANCE COMPUTING
18. Energy-Aware Approaches for HPC Systems
Robert Basmadjian, Georges Da Costa, Ghislain Landry Tsafack Chetsa, Laurent Lefevre, Ariel Oleksiak, and Jean-Marc Pierson
- 18.1 Introduction
- 18.2 Power Consumption of Servers
- 18.3 Classification and Energy Profiles of HPC Applications
- 18.4 Policies and Leverages
- 18.5 Conclusion
19. Strategies for Increased Energy Awareness in Cloud Federations
Gabor Kecskemeti, Attila Kertesz, Attila Cs. Marosi, and Zsolt Nemeth
- 19.1 Introduction
- 19.2 Related Work
- 19.3 Scenarios
- 19.4 Energy-Aware Cloud Federations
- 19.5 Conclusions
20. Enabling Network Security in HPC Systems Using Heterogeneous CMPs
Ozcan Ozturk and Suleyman Tosun
- 20.1 Introduction
- 20.2 Related Work
- 20.3 Overview of Our Approach
- 20.4 Heterogeneous CMP Design for Network Security Processors
- 20.5 Experimental Evaluation
- 20.6 Concluding Remarks

PART VIII APPLICATIONS OF HETEROGENEOUS HIGH-PERFORMANCE COMPUTING
21. Toward a High-Performance Distributed CBIR System for Hyperspectral Remote Sensing Data: A Case Study in Jungle Computing
Timo van Kessel, Niels Drost, Jason Maassen, Henri E. Bal, Frank J. Seinstra, and Antonio J. Plaza
- 21.1 Introduction
- 21.2 CBIR for Hyperspectral Imaging Data
- 21.3 Jungle Computing
- 21.4 IBIS and Constellation
- 21.5 System Design and Implementation
- 21.6 Evaluation
- 21.7 Conclusions
22. Taking Advantage of Heterogeneous Platforms in Image and Video Processing
Sidi A. Mahmoudi, Erencan Ozkan, Pierre Manneback, and Suleyman Tosun
- 22.1 Introduction
- 22.2 Related Work
- 22.3 Parallel Image Processing on GPU
- 22.4 Image Processing on Heterogeneous Architectures
- 22.5 Video Processing on GPU
- 22.6 Experimental Results
- 22.7 Conclusion
23. Real-Time Tomographic Reconstruction Through CPU + GPU Coprocessing
Jose Ignacio Agulleiro, Francisco Vazquez, Ester M. Garzon, and Jose J. Fernandez
- 23.1 Introduction
- 23.2 Tomographic Reconstruction
- 23.3 Optimization of Tomographic Reconstruction for CPUs and for GPUs
- 23.4 Hybrid CPU + GPU Tomographic Reconstruction
- 23.5 Results
- 23.6 Discussion and Conclusion

Acknowledgments
References
Index
and Jesús Carretero 8.1 Introduction 135 8.2 Blue Gene System Overview 136
8.3 Design and Implementation 138 8.4 Conclusions and Future Work 142 PART
IV EFFICIENT EXPLOITATION OF HETEROGENEOUS ARCHITECTURES 145 9. Fair
Resource Sharing for Dynamic Scheduling of Workflows on Heterogeneous
Systems 147 Hamid Arabnejad, Jorge G. Barbosa, and Frédéric Suter 9.1
Introduction 148 9.2 Concurrent Workflow Scheduling 153 9.3 Experimental
Results and Discussion 160 9.4 Conclusions 165 10. Systematic Mapping of
Reed-Solomon Erasure Codes on Heterogeneous Multicore Architectures 169
Roman Wyrzykowski, Marcin Wozniak, and Lukasz Kuczynski 10.1 Introduction
169 10.2 Related Works 171 10.3 Reed-Solomon Codes and Linear Algebra
Algorithms 172 10.4 Mapping Reed-Solomon Codes on Cell/B.E. Architecture
173 10.5 Mapping Reed-Solomon Codes on Multicore GPU Architectures 178 10.6
Methods of Increasing the Algorithm Performance on GPUs 181 10.7 GPU
Performance Evaluation 185 10.8 Conclusions and Future Works 190 11.
Heterogeneous Parallel Computing Platforms and Tools for Compute-Intensive
Algorithms: A Case Study 193 Daniele D'Agostino, Andrea Clematis, and
Emanuele Danovaro 11.1 Introduction 194 11.2 A Low-Cost Heterogeneous
Computing Environment 196 11.3 First Case Study: The N-Body Problem 200
11.4 Second Case Study: The Convolution Algorithm 206 11.5 Conclusions 211
12. Efficient Application of Hybrid Parallelism in Electromagnetism
Problems 215 Alejandro Alvarez-Melcon, Fernando D. Quesada, Domingo
Gimenez, Carlos Pérez-Alcaraz, Jose-Gines Picon, and Tomas Ramírez 12.1
Introduction 215 12.2 Computation of Green's functions in Hybrid Systems
216 12.3 Parallelization in Numa Systems of a Volume Integral Equation
Technique 222 12.4 Autotuning Parallel Codes 226 12.5 Conclusions and
Future Research 230 PART V CPU + GPU COPROCESSING 235 13. Design and
Optimization of Scientific Applications for Highly Heterogeneous and
Hierarchical HPC Platforms Using Functional Computation Performance Models
237 David Clarke, Aleksandar Ilic, Alexey Lastovetsky, Vladimir Rychkov,
Leonel Sousa, and Ziming Zhong 13.1 Introduction 238 13.2 Related Work 241
13.3 Data Partitioning Based on Functional Performance Model 243 13.4
Example Application: Heterogeneous Parallel Matrix Multiplication 245 13.5
Performance Measurement on CPUs/GPUs System 247 13.6 Functional Performance
Models of Multiple Cores and GPUs 248 13.7 FPM-Based Data Partitioning on
CPUs/GPUs System 250 13.8 Efficient Building of Functional Performance
Models 251 13.9 FPM-Based Data Partitioning on Hierarchical Platforms 253
13.10 Conclusion 257 14. Efficient Multilevel Load Balancing on
Heterogeneous CPU + GPU Systems 261 Aleksandar Ilic and Leonel Sousa 14.1
Introduction: Heterogeneous CPU + GPU Systems 262 14.2 Background and
Related Work 265 14.3 Load Balancing Algorithms for Heterogeneous CPU + GPU
Systems 269 14.4 Experimental Results 275 14.5 Conclusions 279 15. The
All-Pair Shortest-Path Problem in Shared-Memory Heterogeneous Systems 283
Hector Ortega-Arranz, Yuri Torres, Diego R. Llanos, and Arturo
Gonzalez-Escribano 15.1 Introduction 283 15.2 Algorithmic Overview 285 15.3
CUDA Overview 287 15.4 Heterogeneous Systems and Load Balancing 288 15.5
Parallel Solutions to The APSP 289 15.6 Experimental Setup 291 15.7
Experimental Results 293 15.8 Conclusions 297 PART VI EFFICIENT
EXPLOITATION OF DISTRIBUTED SYSTEMS 301 16. Resource Management for HPC on
the Cloud 303 Marc E. Frincu and Dana Petcu 16.1 Introduction 303 16.2 On
the Type of Applications for HPC and HPC2 305 16.3 HPC on the Cloud 306
16.4 Scheduling Algorithms for HPC2 311 16.5 Toward an Autonomous
Scheduling Framework 312 16.6 Conclusions 319 17. Resource Discovery in
Large-Scale Grid Systems 323 Konstantinos Karaoglanoglou and Helen Karatza
17.1 Introduction and Background 323 17.2 The Semantic Communities Approach
325 17.3 The P2P Approach 329 17.4 The Grid-Routing Transferring Approach
333 17.5 Conclusions 337 PART VII ENERGY AWARENESS IN HIGH-PERFORMANCE
COMPUTING 341 18. Energy-Aware Approaches for HPC Systems 343 Robert
Basmadjian, Georges Da Costa, Ghislain Landry Tsafack Chetsa, Laurent
Lefevre, Ariel Oleksiak, and Jean-Marc Pierson 18.1 Introduction 344 18.2
Power Consumption of Servers 345 18.3 Classification and Energy Profiles of
HPC Applications 354 18.4 Policies and Leverages 359 18.5 Conclusion 360
19. Strategies for Increased Energy Awareness in Cloud Federations 365
Gabor Kecskemeti, AttilaKertesz, Attila Cs. Marosi, and Zsolt Nemeth 19.1
Introduction 365 19.2 Related Work 367 19.3 Scenarios 369 19.4 Energy-Aware
Cloud Federations 374 19.5 Conclusions 379 20. Enabling Network Security in
HPC Systems Using Heterogeneous CMPs 383 Ozcan Ozturk and Suleyman Tosun
20.1 Introduction 384 20.2 Related Work 386 20.3 Overview of Our Approach
387 20.4 Heterogeneous CMP Design for Network Security Processors 390 20.5
Experimental Evaluation 394 20.6 Concluding Remarks 397 PART VIII
APPLICATIONS OF HETEROGENEOUS HIGH-PERFORMANCE COMPUTING 401 21. Toward a
High-Performance Distributed CBIR System for Hyperspectral Remote Sensing
Data: A Case Study in Jungle Computing 403 Timo van Kessel, NielsDrost,
Jason Maassen, Henri E. Bal, Frank J. Seinstra, and Antonio J. Plaza 21.1
Introduction 404 21.2 CBIR For Hyperspectral Imaging Data 407 21.3 Jungle
Computing 410 21.4 IBIS and Constellation 412 21.5 System Design and
Implementation 415 21.6 Evaluation 420 21.7 Conclusions 426 22. Taking
Advantage of Heterogeneous Platforms in Image and Video Processing 429 Sidi
A. Mahmoudi, Erencan Ozkan, Pierre Manneback, and Suleyman Tosun 22.1
Introduction 430 22.2 Related Work 431 22.3 Parallel Image Processing on
GPU 433 22.4 Image Processing on Heterogeneous Architectures 437 22.5 Video
Processing on GPU 438 22.6 Experimental Results 444 22.7 Conclusion 447 23.
Real-Time Tomographic Reconstruction Through CPU + GPU Coprocessing 451
Jose Ignacio Agulleiro, Francisco Vazquez, Ester M. Garzon, and Jose J.
Fernandez 23.1 Introduction 452 23.2 Tomographic Reconstruction 453 23.3
Optimization of Tomographic Reconstruction for CPUs and for GPUs 455 23.4
Hybrid CPU + GPU Tomographic Reconstruction 457 23.5 Results 459 23.6
Discussion and Conclusion 461 Acknowledgments 463 References 463 Index 467