- Paperback
The book provides a detailed overview of OpenACC, a parallel programming approach for massively parallel processors. It introduces the individual feature sets of OpenACC, with exercises and case studies that showcase the language constructs, and offers insights into writing an efficient OpenACC program. It also explains how OpenACC language constructs are translated in order to achieve application performance. Chapters on parallelization, optimization strategies, and best programming practices are also included.
Note: This item can only be delivered to a German shipping address.
Product details
- Publisher: Addison-Wesley / Pearson Deutschland GmbH
- Pages: 320
- Publication date: September 10, 2017
- Language: English
- Dimensions: 231 mm x 189 mm x 22 mm
- Weight: 540 g
- ISBN-13: 9780134694283
- ISBN-10: 0134694287
- Item no.: 48318847
- Manufacturer information
- Libri GmbH
- Europaallee 1
- 36244 Bad Hersfeld
- 06621 890
Sunita Chandrasekaran is an assistant professor in the Computer and Information Sciences Department at the University of Delaware. Her research interests include exploring the suitability of high-level programming models and runtime systems for HPC and embedded platforms, and migrating scientific applications to heterogeneous computing systems. Dr. Chandrasekaran was a post-doctoral fellow at the University of Houston and holds a Ph.D. from Nanyang Technological University, Singapore. She is a member of OpenACC, OpenMP, MCA, and SPEC HPG. She has served on the program committees of various conferences and workshops, including SC, ISC, ICPP, CCGrid, Cluster, and PACT, and has co-chaired parallel programming workshops co-located with SC, ISC, IPDPS, and SIAM.
Guido Juckeland is head of the Computational Science Group, Department for Information Services and Computing, Helmholtz-Zentrum Dresden-Rossendorf (HZDR), and coordinates the work of the GPU Center of Excellence at Dresden. He also represents HZDR at the SPEC High Performance Group and the OpenACC committee. He received his Ph.D. from Technische Universität Dresden for his work on performance analysis for hardware accelerators. He was a Gordon Bell Award Finalist in 2013. Previously he worked as an IT architect and post-doctoral researcher for the Center for Information Services and High Performance Computing (ZIH) at TU Dresden, Germany. He has served on the program committees of various conferences and workshops, including ISC, EuroPar, CCGrid, ASHES, P^3MA, PMBS, WACCPD, and PACT, and has co-chaired parallel programming workshops co-located with SC.
Foreword xv
Preface xxi
Acknowledgments xxiii
About the Contributors xxv
Chapter 1: OpenACC in a Nutshell 1
1.1 OpenACC Syntax 3
1.2 Compute Constructs 6
1.3 The Data Environment 11
1.4 Summary 15
1.5 Exercises 15
Chapter 2: Loop-Level Parallelism 17
2.1 Kernels Versus Parallel Loops 18
2.2 Three Levels of Parallelism 21
2.3 Other Loop Constructs 24
2.4 Summary 30
2.5 Exercises 31
Chapter 3: Programming Tools for OpenACC 33
3.1 Common Characteristics of Architectures 34
3.2 Compiling OpenACC Code 35
3.3 Performance Analysis of OpenACC Applications 36
3.4 Identifying Bugs in OpenACC Programs 51
3.5 Summary 53
3.6 Exercises 54
Chapter 4: Using OpenACC for Your First Program 59
4.1 Case Study 59
4.2 Creating a Naive Parallel Version 68
4.3 Performance of OpenACC Programs 71
4.4 An Optimized Parallel Version 73
4.5 Summary 78
4.6 Exercises 79
Chapter 5: Compiling OpenACC 81
5.1 The Challenges of Parallelism 82
5.2 Restructuring Compilers 88
5.3 Compiling OpenACC 92
5.4 Summary 97
5.5 Exercises 97
Chapter 6: Best Programming Practices 101
6.1 General Guidelines 102
6.2 Maximize On-Device Compute 105
6.3 Optimize Data Locality 108
6.4 A Representative Example 112
6.5 Summary 118
6.6 Exercises 119
Chapter 7: OpenACC and Performance Portability 121
7.1 Challenges 121
7.2 Target Architectures 123
7.3 OpenACC for Performance Portability 124
7.4 Code Refactoring for Performance Portability 126
7.5 Summary 132
7.6 Exercises 133
Chapter 8: Additional Approaches to Parallel Programming 135
8.1 Programming Models 135
8.2 Programming Model Components 142
8.3 A Case Study 155
8.4 Summary 170
8.5 Exercises 170
Chapter 9: OpenACC and Interoperability 173
9.1 Calling Native Device Code from OpenACC 174
9.2 Calling OpenACC from Native Device Code 181
9.3 Advanced Interoperability Topics 182
9.4 Summary 185
9.5 Exercises 185
Chapter 10: Advanced OpenACC 187
10.1 Asynchronous Operations 187
10.2 Multidevice Programming 204
10.3 Summary 213
10.4 Exercises 213
Chapter 11: Innovative Research Ideas Using OpenACC, Part I 215
11.1 Sunway OpenACC 215
11.2 Compiler Transformation of Nested Loops for Accelerators 224
Chapter 12: Innovative Research Ideas Using OpenACC, Part II 237
12.1 A Framework for Directive-Based High-Performance Reconfigurable Computing 237
12.2 Programming Accelerated Clusters Using XcalableACC 253
Index 269