- Hardcover
This book presents the most recent advances in parallel and distributed computing from experts in the field. It offers a unique format in which professionals can present, discuss, and exchange their recent advances, new ideas, results, works in progress, and experiences in parallel and distributed computing for science and engineering applications. Every chapter includes an introduction, detailed background, and in-depth discussion.
The state of the art of high-performance computing
Prominent researchers from around the world have gathered to present the state-of-the-art techniques and innovations in high-performance computing (HPC), including:
- Programming models for parallel computing: graph-oriented programming (GOP), OpenMP, the stages and transformation (SAT) approach, the bulk-synchronous parallel (BSP) model, Message Passing Interface (MPI), and Cilk
- Architectural and system support, featuring the code tiling compiler technique, the MigThread application-level migration and checkpointing package, the new prefetching scheme of atomicity, a new "receiver makes right" data conversion method, and lessons learned from applying reconfigurable computing to HPC
- Scheduling and resource management issues with heterogeneous systems, bus saturation effects on SMPs, genetic algorithms for distributed computing, and novel task-scheduling algorithms
- Clusters and grid computing: design requirements, grid middleware, distributed virtual machines, data grid services and performance-boosting techniques, security issues, and open issues
- Peer-to-peer computing (P2P) including the proposed search mechanism of hybrid periodical flooding (HPF) and routing protocols for improved routing performance
- Wireless and mobile computing, featuring discussions of implementing the Gateway Location Register (GLR) concept in 3G cellular networks, maximizing network longevity, and comparisons of QoS-aware scatternet scheduling algorithms
- High-performance applications including partitioners, running Bag-of-Tasks applications on grids, using low-cost clusters to meet high-demand applications, and advanced convergent architectures and protocols
High-Performance Computing: Paradigm and Infrastructure is an invaluable compendium for engineers, IT professionals, and researchers and students of computer science and applied mathematics.
Note: This item can only be delivered to a German shipping address.
Product details
- Wiley Series on Parallel and Distributed Computing
- Publisher: Wiley & Sons
- Publisher's item no.: 14665471000
- 1st edition
- Number of pages: 816
- Publication date: 1 August 2005
- English
- Dimensions: 242mm x 161mm x 42mm
- Weight: 1215g
- ISBN-13: 9780471654711
- ISBN-10: 047165471X
- Item no.: 12987402
- Manufacturer identification
- Product safety contact
- Europaallee 1
- 36244 Bad Hersfeld
- gpsr@libri.de
LAURENCE T. YANG is a Professor of Computer Science at St. Francis Xavier University, Canada. Dr. Yang served as vice chair of the IEEE Technical Committee of Supercomputing Applications (TCSA) until 2004 and has been an executive committee member of the IEEE Technical Committee of Scalable Computing (TCSC) since 2004. Dr. Yang has also received many awards, including the Distinguished Contribution Award, 2004; Technical Achievement Award, 2004; Outstanding Achievement Award, 2002; University Research/Publication/Teaching Award, 2000-2001/2002-2003/2003-2004; and Canada Foundation for Innovation (CFI) Award, 2003.
MINYI GUO received his PhD from the University of Tsukuba, Japan. He is currently an Associate Professor in the Department of Computer Software at the University of Aizu, Japan. In addition, Dr. Guo is Editor in Chief of the International Journal of Embedded Systems, and has written and edited books in the area of parallel and distributed computing, as well as embedded and ubiquitous computing.
Preface.
Contributors.
PART 1. PROGRAMMING MODEL.
1. ClusterGOP: A High-Level Programming Environment for Clusters (Fan Chan,
Jiannong Cao and Minyi Guo).
1.1 Introduction.
1.2 GOP Model and ClusterGOP Architecture.
1.3 VisualGOP.
1.4 The ClusterGOP Library.
1.5 MPMD Programming Support.
1.6 Programming Using ClusterGOP.
1.7 Summary.
2. The Challenge of Providing A High-Level Programming Model for
High-Performance Computing (Barbara Chapman).
2.1 Introduction.
2.2 HPC Architectures.
2.3 HPC Programming Models: The First Generation.
2.4 The Second Generation of HPC Programming Models.
2.5 OpenMP for DMPs.
2.6 Experiments with OpenMP on DMPs.
2.7 Conclusions.
3. SAT: Toward Structured Parallelism Using Skeletons (Sergei Gorlatch).
3.1 Introduction.
3.2 SAT: A Methodology Outline.
3.3 Skeletons and Collective Operations.
3.4 Case Study: Maximum Segment SUM (MSS).
3.5 Performance Aspect in SAT.
3.6 Conclusions and Related Work.
4. Bulk-Synchronous Parallelism: An Emerging Paradigm of High-Performance
Computing (Alexander Tiskin).
4.1 The BSP Model.
4.2 BSP Programming.
4.3 Conclusions.
5. Cilk Versus MPI: Comparing Two Parallel Programming Styles on
Heterogeneous Systems (John Morris, KyuHo Lee and JunSeong Kim).
5.1 Introduction.
5.2 Experiments.
5.3 Results.
5.4 Conclusion.
6. Nested Parallelism and Pipelining in OpenMP (Marc Gonzalez, E. Ayguade,
X. Martorell and J. Labarta).
6.1 Introduction.
6.2 OpenMP Extensions for Nested Parallelism.
6.3 OpenMP Extensions for Thread Synchronization.
6.4 Summary.
7. OpenMP for Chip Multiprocessors (Feng Liu and Vipin Chaudhary).
7.1 Introduction.
7.2 3SoC Architecture Overview.
7.3 The OpenMP Compiler/Translator.
7.4 Extensions to OpenMP for DSEs.
7.5 Optimization for OpenMP.
7.6 Implementation.
7.7 Performance Evaluation.
7.8 Conclusions.
PART 2. ARCHITECTURAL AND SYSTEM SUPPORT.
8. Compiler and Run-Time Parallelization Techniques for Scientific
Computations on Distributed-Memory Parallel Computers (PeiZong Lee,
Chien-Min Wang and Jan-Jan Wu).
8.1 Introduction.
8.2 Background Material.
8.3 Compiling Regular Programs on DMPCs.
8.4 Compiler and Run-Time Support for Irregular Programs.
8.5 Library Support for Irregular Applications.
8.6 Related Works.
8.7 Concluding Remarks.
9. Enabling Partial-Cache Line Prefetching Through Data Compression (Youtao
Zhang and Rajiv Gupta).
9.1 Introduction.
9.2 Motivation of Partial Cache-Line Prefetching.
9.3 Cache Design Details.
9.4 Experimental Results.
9.5 Related Work.
9.6 Conclusion.
10. MPI Atomicity and Concurrent Overlapping I/O (Wei-Keng Liao, Alok
Choudhary, Kenin Coloma, Lee Ward, Eric Russell and Neil Pundit).
10.1 Introduction.
10.2 Concurrent Overlapping I/O.
10.3 Implementation Strategies.
10.4 Experiment Results.
10.5 Summary.
11. Code Tiling: One Size Fits All (Jingling Xue and Qingguang Huang).
11.1 Introduction.
11.2 Cache Model.
11.3 Code Tiling.
11.4 Data Tiling.
11.5 Finding Optimal Tile Sizes.
11.6 Experimental Results.
11.7 Related Work.
11.8 Conclusion.
12. Data Conversion for Heterogeneous Migration/Checkpointing (Hai Jiang,
Vipin Chaudhary and John Paul Walters).
12.1 Introduction.
12.2 Migration and Checkpointing.
12.3 Data Conversion.
12.4 Coarse-Grain Tagged RMR in MigThread.
12.5 Microbenchmarks and Experiments.
12.6 Related Work.
12.7 Conclusions and Future Work.
13. Receiving-Message Prediction and Its Speculative Execution (Takanobu
Baba, Takashi Yokota, Kamemitsu Ootsu, Fumihitto Furukawa and Yoshiyuki
Iwamoto).
13.1 Background.
13.2 Receiving-Message Prediction Method.
13.3 Implementation of the Method in the MPI Libraries.
13.4 Experimental Results.
13.5 Concluding Remarks.
14. An Investigation of the Applicability of Distributed FPGAs to
High-Performance Computing (John P. Morrison, Padraig O'Dowd and Philip D.
Healy).
14.1 Introduction.
14.2 High Performance Computing with Cluster Computing.
14.3 Reconfigurable Computing with FPGAs.
14.4 DRMC: A Distributed Reconfigurable Metacomputer.
14.5 Algorithms Suited to the Implementation on FPGAs/DRMC.
14.6 Algorithms Not Suited to the Implementation on FPGAs/DRMC.
14.7 Summary.
PART 3. SCHEDULING AND RESOURCE MANAGEMENT.
15. Bandwidth-Aware Resource Allocation for Heterogeneous Computing Systems
to Maximize Throughput (Bo Hong and Viktor K. Prasanna).
15.1 Introduction.
15.2 Related Work.
15.3 Systems Model and Problem Statement.
15.4 Resource Allocation to Maximize System Throughput.
15.5 Experimental Results.
15.6 Conclusion.
16. Scheduling Algorithms with Bus Bandwidth Considerations for SMPs
(Christos D. Antonopoulos, Dimitrios S. Nikolopoulos and Theodore S.
Papatheodorou).
16.1 Introduction.
16.2 Related Work.
16.3 The Implications of Bus Bandwidth for Application Performance.
16.4 Scheduling Policies for Preserving Bus Bandwidth.
16.5 Experimental Evaluation.
16.6 Conclusions.
17. Toward Performance Guarantee of Dynamic Task Scheduling of a
Parameter-Sweep Application onto a Computational Grid (Noriyuki Fujimoto
and Kenichi Hagihara).
17.1 Introduction.
17.2 A Grid Scheduling Model.
17.3 Related Works.
17.4 The Proposed Algorithm RR.
17.5 The Performance Guarantee of the Proposed Algorithm.
17.6 Conclusion.
18. Performance Study of Reliability Maximization and Turnaround
Minimization with GA-based Task Allocation in DCS (Deo Prakash Vidyarthi,
Anil Kumar Tripathi, Biplab Kumer Sarker, Kirti Rani and Laurence T. Yang).
18.1 Introduction.
18.2 GA for Task Allocation.
18.3 The Algorithm.
18.4 Illustrative Examples.
18.5 Discussions and Conclusion.
19. Toward Fast and Efficient Compile-Time Task Scheduling in Heterogeneous
Computing Systems (Tarek Hagras and Jan Janecek).
19.1 Introduction.
19.2 Problem Definition.
19.3 The Suggested Algorithm.
19.4 Heterogeneous Systems Scheduling Heuristics.
19.5 Experimental Results and Discussion.
19.6 Conclusion.
20. An On-Line Approach for Classifying and Extracting Application Behavior
on Linux (Luciano José Senger, Rodrigo Fernandes de Mello, Marcos José
Santana, Regina Helena Carlucci Santana and Laurence Tianruo Yang).
20.1 Introduction.
20.2 Related Work.
20.3 Information Acquisition.
20.4 Linux Process Classification Model.
20.5 Results.
20.6 Evaluation of The Model Intrusion on the System Performance.
20.7 Conclusions.
PART 4. CLUSTERS AND GRID COMPUTING.
21. Peer-to-Peer Grid Computing and a .NET-Based Alchemi Framework (Akshay
Luther, Rajkumar Buyya, Rajiv Ranjan and Srikumar Venugopal).
21.1 Introduction.
21.2 Background.
21.3 Desktop Grid Middleware Considerations.
21.4 Representative Desktop Grid Systems.
21.5 Alchemi Desktop Grid Framework.
21.6 Alchemi Design and Implementation.
21.7 Alchemi Performance Evaluation.
21.8 Summary and Future Work.
22. Global Grids and Software Toolkits: A Study of Four Grid Middleware
Technologies (Parvin Asadzadeh, Rajkumar Buyya, Chun Ling Kei, Deepa Nayar
and Srikumar Venugopal).
22.1 Introduction.
22.2 Overview of Grid Middleware Systems.
22.3 Unicore.
22.4 Globus.
22.5 Legion.
22.6 Gridbus.
22.7 Implementation of UNICORE Adaptor for Gridbus Broker.
22.8 Comparison of Middleware Systems.
22.9 Summary.
23. High-Performance Computing on Clusters: The Distributed JVM Approach
(Wenzhang Zhu, Weijian Fang, Cho-Li Wang and Francis C. M. Lau).
23.1 Background.
23.2 Distributed JVM.
23.3 JESSICA2 Distributed JVM.
23.4 Performance Analysis.
23.5 Related Work.
23.6 Summary.
24. Data Grids: Supporting Data-Intensive Applications in Wide-Area
Networks (Xiao Qin and Hong Jiang).
24.1 Introduction.
24.2 Data Grid Services.
24.3 High-Performance Data Grid.
24.4 Security Issues.
24.5 Open Issues.
24.6 Conclusions.
25. Application I/O on a Parallel File System for Linux Clusters (Dheeraj
Bhaardwaj).
25.1 Introduction.
25.2 Application I/O.
25.3 Parallel I/O System Software.
25.4 Standard Unix & Parallel I/O.
25.5 Example: Seismic Imaging.
25.6 Discussion and Conclusion.
26. One Teraflop Achieved with a Geographically Distributed Linux Cluster
(Peng Wang, George Turner, Steven Simms, Dave Hart, Mary Papakhiam and
Craig Stewart).
26.1 Introduction.
26.2 Hardware and Software Setup.
26.3 System Tuning and Benchmark Results.
26.4 Performance Costs and Benefits.
27. A Grid-Based Distributed Simulation of Plasma Turbulence (Beniamino Di
Martino, Salvatore Venticinque, Sergio Criguglio, Giulana Fogaccia and
Gregorio Vlad).
27.1 Introduction.
27.2 MPI Implementation of The Internode Domain Decomposition.
27.3 Integration of The Internode Domain Decomposition with Intranode
Particle Decomposition Strategies.
27.4 The MPICH-G2 Implementation.
27.5 Conclusions.
28. Evidence-Aware Trust Model for Dynamic Services (Ali Shaikh Ali, Omer
F. Rana and Rashid J. Al-Ali).
28.1 Motivation For Evaluating Trust.
28.2 Service Trust: What Is It?
28.3 Evidence-Aware Trust Model.
28.4 The System Life Cycle.
28.5 Conclusion.
PART 5. PEER-TO-PEER COMPUTING.
29. Resource Discovery in Peer-to-Peer Infrastructures (Huang-Chang Hsiao
and Chung-Ta King).
29.1 Introduction.
29.2 Design Requirements.
29.3 Unstructured P2P Systems.
29.4 Structured P2P Systems.
29.5 Advanced Resource Discovery for Structured P2P Systems.
29.6 Summary.
30. Hybrid Periodical Flooding in Unstructured Peer-to-Peer Networks
(Yunhao Liu, Li Xiao, Lionel M. Ni and Zhenyun Zhuang).
30.1 Introduction.
30.2 Search Mechanisms.
30.3 Hybrid Periodical Flooding.
30.4 Simulation Methodology.
30.5 Performance Evaluation.
30.6 Conclusion.
31. HIERAS: A DHT-Based Hierarchical P2P Routing Algorithm (Zhiyong Xu,
Yiming Hu and Laxmi Bhuyan).
31.1 Introduction.
31.2 Hierarchical P2P Architecture.
31.3 System Design.
31.4 Performance Evaluation.
31.5 Related Works.
31.6 Summary.
32. Flexible and Scalable Group Communication Model for Peer-to-Peer
Systems (Tomoya Enokido and Makoto Takizawa).
32.1 Introduction.
32.2 Group of Agents.
32.3 Functions of Group Protocol.
32.4 Autonomic Group Protocol.
32.5 Retransmission.
32.6 Conclusion.
PART 6. WIRELESS AND MOBILE COMPUTING.
33. Study of Cache-Enhanced Dynamic Movement-Based Location Management
Schemes for 3G Cellular Networks (Krishna Priya Patury, Yi Pan, Xiaola Lin,
Yang Xiao and Jie Li).
33.1 Introduction.
33.2 Location Management with and without Cache.
33.3 The Cache-Enhanced Location Management Scheme.
33.4 Simulation Results and Analysis.
33.5 Conclusion.
34. Maximizing Multicast Lifetime in Wireless Ad Hoc Networks (Guofeng Deng
and Sandeep K. S. Gupta).
34.1 Introduction.
34.2 Energy Consumption Model In WANETs.
34.3 Definitions of Maximum Multicast Lifetime.
34.4 Maximum Multicast Lifetime of The Network Using Single Tree (MMLM).
34.5 Maximum Multicast Lifetime of The Network Using Multiple Trees (MMLM).
34.6 Summary.
35. A QoS-Aware Scheduling Algorithm for Bluetooth Scatternets (Young Man
Kim, Ten H. Lai and Anish Arora).
35.1 Introduction.
35.2 Perfect Scheduling Problem for Bipartite Scatternet.
35.3 Perfect Assignment Scheduling Algorithm for Bipartite Scatternets.
35.4 Distributed, Local, and Incremental Scheduling Algorithms.
35.5 Performance and QOS Analysis.
35.6 Conclusion.
PART 7. HIGH PERFORMANCE APPLICATIONS.
36. A Workload Partitioner for Heterogeneous Grids (Daniel J. Harvey, Sajal
K. Das and Rupak Biswas).
36.1 Introduction.
36.2 Preliminaries.
36.3 The MinEX Partitioner.
36.4 N-Body Application.
36.5 Experimental Study.
36.6 Conclusion.
37. Building a User-Level Grid for Bag-of-Tasks Applications (Walfredo
Cirne, Francisco Brasileiro, Daniel Paranhos, Lauro Costa, Elizeu
Santos-Neto and Carla Osthoff).
37.1 Introduction.
37.2 Design Goals.
37.3 Architecture.
37.4 Working Environment.
37.5 Scheduling.
37.6 Implementation.
37.7 Performance Evaluation.
37.8 Conclusions and Future Work.
38. An Efficient Parallel Method for Calculating the Smarandache Function
(Sabin Tabirca, Tatiana Tabirca, Kieran Reynolds and Laurence T. Yang).
38.1 Introduction.
38.2 Computing in Parallel.
38.3 Experimental Results.
38.4 Conclusion.
39. Design, Implementation and Deployment of a Commodity Cluster for
Periodic Comparison of Gene Sequences (Anita M. Orendt, Brian Haymore,
David Richardson, Sofia Robb, Alejandro Sanchez Alvarado and Julio C.
Facelli).
39.1 Introduction.
39.2 System Requirements and Design.
39.3 Performance.
39.4 Conclusions.
40. A Hierarchical Distributed Shared-Memory Parallel Branch & Bound
Application with PVM and OpenMP on Multiprocessor Clusters (Rocco Aversa,
Beniamino Di Martino, Nicola Mazzocca and Salvatore Venticinque).
40.1 Introduction.
40.2 The B&B Parallel Application.
40.3 The OpenMP Extension.
40.4 Experimental Results.
40.5 Conclusions.
41. IP Based Telecommunication Services (Anna Bonifacio and G. Spinillo).
41.1 Introduction.
Index.
Contributors.
PART 1. PROGRAMMING MODEL.
1. ClusterGOP: A High-Level Programming Environment for Clusters (Fan Chan,
Jiannong Cao and Minyi Guo).
1.1 Introduction.
1.2 GOP Model and ClusterGOP Architecture.
1.3 VisualGOP.
1.4 The ClusterGOP Library.
1.5 MPMD Programming Support.
1.6 Programming Using ClusterGOP.
1.7 Summary.
2. The Challenge of Providing A High-Level Programming Model for
High-Performance Computing (Barbara Chapman).
2.1 Introduction.
2.2 HPC Architectures.
2.3 HPC Programming Models: The First Generation.
2.4 The Second generation of HPC Programming Models.
2.5 OpenMP for DMPs.
2.6 Experiments with OpenMP on DMPs.
2.7 Conclusions.
3. SAT: Toward Structured Parallelism Using Skeletons (Sergei Gorlatch).
3.1 Introduction.
3.2 SAT: A Methodology Outline.
3.3 Skeletons and Collective Operations.
3.4 Case Study: Maximum Segment SUM (MSS).
3.5 Performance Aspect in SAT.
3.6 Conclusions and Related Work.
4. Bulk-Synchronous Parallelism: An Emerging Paradigm of High-Performance
Computing (Alexander Tiskin).
4.1 The BSP Model.
4.2 BSP Programming.
4.3 Conclusions.
5. Cilk Versus MPI: Comparing Two Parallel Programming Styles on
Heterogenous Systems (John Morris, KyuHo Lee and JunSeong Kim).
5.1 Introduction.
5.2 Experiments.
5.3 Results.
5.4 Conclusion.
6. Nested Parallelism and Pipelining in OpenMP (Marc Gonzalez, E. Ayguade,
X. Martorell and J. Labarta).
6.1 Introduction.
6.2 OpenMP Extensions for Nested Parallelism.
6.3 OpenMP Extensions for Thread Synchronization.
6.4 Summary.
7. OpenMP for Chip Multiprocessors (Feng Liu and Vipin Chaudhary).
7.1 Introduction.
7.2 3SoC Architecture Overview.
7.3 The OpenMP Conpiler/Translator.
7.4 Extensions to OpenMP for DSEs.
7.5 Optimization for OpenMP.
7.6 Implementation.
7.7 Performance Evaluation.
7.8 Conclusions.
PART 2. ARCHITECTURAL AND SYSTEM SUPPORT.
8. Compiler and Run-Time Parallelization Techniques for Scientific
Computations on Distributed-Memory Parallel Computers (PeiZong Lee,
Cheien-Min Wang and Jan-Jan Wu).
8.1 Introduction.
8.2 Background Material.
8.3 Compiling Regular Programs on DMPCs.
8.4 Compiler and Run-Time Support for Irregular Programs.
8.5 Library Support for Irregular Applications.
8.6 Related Works.
8.7 Concluding Remarks.
9. Enabling Partial-Cache Line Prefetching Through Data Compression (Youtao
Zhang and Rajiv Gupta).
9.1 Introduction.
9.2 Motivation of Partial Cache-Line Perfetching.
9.3 Cache Design Details.
9.4 Experimental Results.
9.5 Related Work.
9.6 Conclusion.
10. MPI Atomicity and Concurrent Overlapping I/O (Wei-Keng Liao, Alok
Choudhary, Kenin Coloma, Lee Ward, Eric Russell and Neil Pundit).
10.1 Introduction.
10.2 Concurrent Overlapping I/O.
10.3 Implementation Strategies.
10.4 Experiment Results.
10.5 Summary.
11. Code Tiling: One Size Fits All (Jingling Xue and Qingguang Huang).
11.1 Introduction.
11.2 Cache Model.
11.3 Code Tiling.
11.4 Data Tiling.
11.5 Finding Optimal Tile Sizes.
11.6 Experimental Results.
11.7 Related Work.
11.8 Conclusion.
12. Data Conversion for Heterogeneous Migration/Checkpointing (Hai Jiang,
Vipin Chaudhary and John Paul Walters).
12.1 Introduction.
12.2 Migration and Checkpointing.
12.3 Data Conversion.
12.4 Coarse-Grain Tagged RMR in MigThread.
12.5 Microbenchmarks and Experiments.
12.6 Related Work.
12.7 Conclusions and Future Work.
13. Receiving-Message Prediction and Its Speculative Execution (Takanobu
Baba, Takashi Yokota, Kamemitsu Ootsu, Fumihitto Furukawa and Yoshiyuki
Iwamoto).
13.1 Background.
13.2 Receiving-Message Prediction Method.
13.3 Implementation of the Method in the MIPI Libraries.
13.4 Experimental Results.
13.5 Conclusing Remarks.
14. An Investigation of the Applicability of Distributed FPGAs to
High-Performance Computing (John P. Morrison, Padraig O'Dowd and Philip D.
Healy).
14.1 Introduction.
14.2 High Performance Computing with Cluster Computing.
14.3 Reconfigurable Computing with EPGAs.
14.4 DRMC: A Distributed Reconfigurable Metacomputer.
14.5 Algorithms Suited to the Implementation on FPGAs/DRMC.
14.6 Algorithms Not Suited to the Implementation on FPGAs/DRMC.
14.7 Summary.
PART 3. SCHEDULING AND RESOURCE MANAGEMENT.
15. Bandwidth-Aware Resource Allocation for Heterogeneous Computing Systems
to Maximize Throughput (Bo Hong and Viktor K. Prasanna).
15.1 Introduction.
15.2 Related Work.
15.3 Systems Model and Problem Statement.
15.4 Resource Allocation to Maximize System Throughput.
15.5 Experimental Results.
15.6 Conclusion.
16. Scheduling Algorithms with Bus Bandwidth Considerations for SMPs
(Christos D. Antonopoulos, Dimitrios S., Nikolopoulos and Theeodore S.
Papatheodorou).
16.1 Introduction.
16.2 Related Work.
16.3 The Implications of Bus Bandwidth for Application Performance.
16.4 Scheduling Policies for Preserving Bus Bandwidth.
16.5 Experimental Evaluation.
16.6 Conclusions.
17. Toward Performance Guarantee of Dynamic Task Scheduling of a
Parameter-Sweep Application onto a Computational Grid (Noriyuki Fujimoto
and Kenichi Hagihara).
17.1 Introduction.
17.2 A Grid Scheduling Model.
17.3 Related Works.
17.4 The Proposed Algorithm RR.
17.5 The Performance Guarantee of the Proposed Algorithm.
17.6 Conclusion.
18. Performance Study of Reliability Maximization and Turnaround
Minimization with GA-based Task Allocation in DCS (Deo Prakash Vidyarthi,
Anil Kumar Tripathi, Biplab Kumer Sarker, Kirti Rani and Laurence T. Yang).
18.1 Introduction.
18.2 GA for Task Allocation.
18.3 The Algorithm.
18.4 Illustrative Examples.
18.5 Discussions and Conclusion.
19. Toward Fast and Efficient Compile-Time Task Scheduling in Heterogeneous
Computing Systems (Tarek Hagras and Jan Janecek).
19.1 Introduction.
19.2 Problem Definition.
19.3 The Suggested Algorithm.
19.4 Heterogeneous Systems Scheduling Heuristics.
19.5 Experimental Results and Discussion.
19.6 Conclusion.
20. An On-Line Approach for Classifying and Extracting Application Behavior
on Linux (Luciano José Senger, Rodrigo Fernandes de Mello, Marcos José
Santana, Regina Helena Carlucci Santana and Laurence Tianruo Yang).
20.1 Introduction.
20.2 Related Work.
20.3 Information Acquisition.
20.4 Linux Process Classification Model.
20.5 Results.
20.6 Evaluation of The Model Intrusion on the System Performance.
20.7 Conclusions.
PART 4. CLUSTERS AND GRID COMPUTING.
21. Peer-to-Peer Grid Computing and a .NET-Based Alchemi Framework (Akshay
Luther, Rajkumar Buyya, Rajiv Ranjan and Srikumar Venugopal).
21.1 Introduction.
21.2 Background.
21.3 Desktop Grid Middleware Considerations.
21.4 Representation Desktop Grid Systems.
21.5 Alchemi Desktop Grid Framework.
21.6 Alchemi Design and Implementation.
21.7 Alchemi Performance Evaluation.
21.8 Summary and Future Work.
22. Global Grids and Software Toolkits: A Study of Four Grid Middleware
Technologies (Parvin Asadzadeh, Rajkumar Buyya, Chun Ling Kei, Deepa Nayar
and Srikumar Venugopal).
22.1 Introduction.
22.2 Overview of Grid Middleware Systems.
22.3 Unicore.
22.4 Globus.
22.5 Legion.
22.6 Gridbus.
22.7 Implementation of UNICORE Adaptor for Gridbus Broker.
22.8 Comparison of Middleware Systems.
22.9 Summary.
23. High-Performance Computing on Clusters: The Distributed JVM Approach
(Wenzhang Zhu, Weijian Fang, Cho-Li Wang and Francis C. M. Lau).
23.1 Background.
23.2 Distributed JVM.
23.3 JESSICA2 Distributed JVM.
23.4 Performance Analysis.
23.5 Related Work.
23.6 Summary.
24. Data Grids: Supporting Data-Intensive Applications in Wide-Area
Networks (Xiao Qin and Hong Jiang).
24.1 Introduction.
24.2 Data Grid Services.
24.3 High-Performance Data Grid.
24.4 Security Issues.
24.5 Open Issues.
24.6 Conclusions.
25. Application I/O on a Parallel File System for Linux Clusters (Dheeraj
Bhaardwaj).
25.1 Introduction.
25.2 Application I/O.
25.3 Parallel I/O System Software.
25.4 Standard Unix & Parallel I/O.
25.5 Example: Seismic Imaging.
25.6 Discussion and Conclusion.
26. One Teraflop Achieved with a Geographically Distributed Linux Cluster
(Peng Wang, George Turner, Steven Simms, Dave Hart, Mary Papakhiam and
Craig Stewart).
26.1 Introduction.
26.2 Hardware and Software Setup.
26.3 System Tuning and Benchmark Results.
26.4 Performance Costs and Benefits.
27. A Grid-Based Distributed Simulation of Plasma Turbulence (Beniamino Di
Martino, Salvatore Venticinque, Sergio Criguglio, Giulana Fogaccia and
Gregorio Vlad).
27.1 Introduction.
27.2 MPI Implementation of The Internode Domain Decomposition.
27.3 Integration of The Internode Domain Decomposition with Intranode
Particle Decomposition Strategies.
27.4 The MPICH-G2 Implementation.
27.5 Conclusions.
28. Evidence-Aware Trust Model for Dynamic Services (Ali Shaikh Ali, Omer
F. Rana and Rashid J. Al-Ali).
28.1 Motivation For Evaluating Trust.
28.2 Service Trust-What Is It?
28.3 Evidence-Aware Trust Model.
28.4 The System Life Cycle.
28.5 Conclusion.
PART 5. PEER-TO-PEER COMPUTING.
29. Resource Discovery in Peer-to-Peer Infrastructures (Huang-Chang Hsiao
and Chung-Ta King).
29.1 Introduction.
29.2 Design Requirements.
29.3 Unstructured P2P Systems 4.
29.4 Structured P2P Systems.
29.5 Advanced Resource Discovery for Structured P2P Systems.
29.6 Summary.
30. Hybrid Periodical Flooding in Unstructured Peer-to-Peer Networks
(Yunhao Liu, Li Xiao, Lionel M. Ni and Zhenyun Zhuang).
30.1 Introduction.
30.2 Serarch Mechanisms.
30.3 Hybrid Periodical Flooding.
30.4 Simulation Methodology.
30.5 Performance Evaluation.
30.6 Conclusion.
31. HIERAS: A DHT-Based Hierarchical P2P Routing Algorithm (Zhiyong Xu,
Yiming Hu and Laxmi Bhuyan).
31.1 Introduction.
31.2 Hierarchical P2P Architecture.
31.3 System Design.
31.4 Performance Evaluation.
31.5 Related Works.
31.6 Summary.
32. Flexible and Scalable Group Communication Model for Peer-to-Peer
Systems (Tomoya Enokido and Makoto Takizawa).
32.1 Introduction.
32.2 Group of Agents.
32.3 Functions of Group Protocol.
32.4 Autonomic Group Protocol.
32.5 Retransmission.
32.6 Conclusion.
PART 6. WIRELESS AND MOBILE COMPUTING.
33. Study of Cache-Enhanced Dynamic Movement-Based Location Management
Schemes for 3G Cellular Networks (Krishna Priya Patury, Yi Pan, Xiaola Lin,
Yang Xiao and Jie Li).
33. 1 Introduction.
33.2 Location Management with and without Cache.
33.3 The Cache-Enhanced Location Management Scheme.
33.4 Simulation Results and Analysis.
33.5 Conclusion.
34. Maximizing Multicast Lifetime in Wireless Ad Hoc Networks (Guofeng Deng
and Sandeep K. S. Gupta).
34.1 Introduction.
34.2 Energy Consumption Model In WANETs.
34.3 Definitions of Maximum Multicast Lifetime.
34.4 Maximum Multicast Lifetime of The Network Using Single Tree (MMLM).
34.5 Maximum Multicast Lifetime of The Network Using Multiple Trees (MMLM).
34.6 Summary.
35. A QoS-Aware Scheduling Algorithm for Bluetooth Scatternets (Young Man
Kim, Ten H. Lai and Anish Arora).
35.1 Introduction.
35.2 Perfect Scheduling Problem for Bipartite Scatternet.
35.3 Perfect Assignment Scheduling Algorithm for Bipartite Scatternets.
35.4 Distributed, Local, and Incremental Scheduling Algorithms.
35.5 Performance and QOS Analysis.
35.6 Conclusion.
PART 7. HIGH PERFORMANCE APPLICATIONS.
36. A Workload Partitioner for Heterogeneous Grids (Daniel J. Harvey, Sajal
K. Das and Rupak Biswas).
36.1 Introduction.
36.2 Preliminaries.
36.3 The MinEX Partitioner.
36.4 N-Body Application.
36.5 Experimental Study.
36.6 Conclusion.
37. Building a User-Level Grid for Bag-of-Tasks Applications (Walfredo
Cirne, Francisco Brasileiro, Daniel Paranhos, Lauro Costa, Elizeu
Santos-Neto and Carla Osthoff).
37.1 Introduction.
37.2 Design Goals.
37.3 Architecture.
37.4 Working Environment.
37.5 Scheduling.
37.6 Implementation.
37.7 Performance Evaluation.
37.8 Conclusions and Future Work.
38. An Efficient Parallel Method for Calculating the Smarandache Function
(Sabin Tabirca, Tatiana Tabirca, Kieran Reynolds and Laurence T. Yang).
38.1 Introduction.
38.2 Computing in Parallel.
38.3 Experimental Results.
38.4 Conclusion.
39. Design, Implementation and Deployment of a Commodity Cluster for
Periodic Comparison of Gene Sequences (Anita M. Orendt, Brian Haymore,
David Richardson, Sofia Robb, Alejandro Sanchez Alvarado and Julio C.
Facelli).
39.1 Introduction.
39.2 System Requirements and Design.
39.3 Performance.
39.4 Conclusions.
40. A Hierarchical Distributed Shared-Memory Parallel Branch & Bound
Application with PVM and OpenMP on Multiprocessor Clusters (Rocco Aversa,
Beniamino Di Martino, Nicola Mazzocca and Salvatore Venticinque).
40.1 Introduction.
40.2 The B&B Parallel Application.
40.3 The OpenMP Extension.
40.4 Experimental Results.
40.5 Conclusions.
41. IP-Based Telecommunication Services (Anna Bonifacio and G. Spinillo).
41.1 Introduction.
Index.