Distributed-memory multiprocessing systems (DMS), such as Intel's hypercubes, the Paragon, Thinking Machines' CM-5, and the Meiko Computing Surface, have rapidly gained user acceptance and promise to deliver the computing power required to solve the grand challenge problems of science and engineering. These machines are relatively inexpensive to build and are potentially scalable to large numbers of processors. However, they are difficult to program: because memory is non-uniform, local accesses are much faster than transfers of non-local data via message-passing operations, so the locality of algorithms must be exploited to achieve acceptable performance. The management of data, with the twin goals of spreading the computational workload and minimizing the delays caused when a processor has to wait for non-local data, becomes of paramount importance.

When a code is parallelized by hand, the programmer must distribute the program's work and data to the processors that will execute it. One common approach exploits the regularity of most numerical computations. This is the so-called Single Program Multiple Data (SPMD), or data-parallel, model of computation. With this method, each data array in the original program is distributed across the processors, establishing an ownership relation, and the computations defining a data item are performed by the processor that owns it.
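The ownership relation described above can be illustrated with a small sketch. It is not taken from the book: it assumes a one-dimensional block distribution and a simple three-point stencil, and the names `block_owner`, `local_range`, and `spmd_step` are hypothetical. The processors are simulated in one Python process, so a non-local read is only counted where a real machine would need a message.

```python
def block_size(n, p):
    """Elements per processor for a block distribution (ceiling of n / p)."""
    return -(-n // p)


def block_owner(i, n, p):
    """Rank of the processor that owns global index i (assumed block layout)."""
    return i // block_size(n, p)


def local_range(rank, n, p):
    """Half-open range [lo, hi) of global indices owned by `rank`."""
    lo = min(rank * block_size(n, p), n)
    hi = min(lo + block_size(n, p), n)
    return lo, hi


def spmd_step(rank, n, p, a):
    """The single program every processor runs (SPMD).

    Computes b[i] = a[i-1] + a[i+1] for the interior indices that `rank`
    owns. `a` is the full array here only because this is a simulation;
    reads of elements owned by another rank are counted as the accesses
    that would require message passing.
    """
    lo, hi = local_range(rank, n, p)
    updates = {}
    nonlocal_reads = 0
    for i in range(max(lo, 1), min(hi, n - 1)):
        for j in (i - 1, i + 1):
            if block_owner(j, n, p) != rank:
                nonlocal_reads += 1
        updates[i] = a[i - 1] + a[i + 1]
    return updates, nonlocal_reads


if __name__ == "__main__":
    n, p = 8, 4
    a = list(range(n))
    for rank in range(p):
        updates, msgs = spmd_step(rank, n, p, a)
        print(f"rank {rank}: owns {local_range(rank, n, p)}, "
              f"computes {updates}, non-local reads: {msgs}")
```

In this sketch only the elements at block boundaries trigger non-local reads, which is the kind of locality the description says must be exploited; a different distribution (for example cyclic) would change that count for the same loop.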
Table of Contents
2 The Weight Finder - An Advanced Profiler for Fortran Programs
  2.1 Introduction
  2.2 Prerequisite
  2.3 The Weight Finder
  2.4 Adaptation of Profile Data
  2.5 Conclusion and Future Work
3 Predicting Execution Times of Sequential Scientific Kernels
  3.1 Motivation
  3.2 Deriving time formulae for code fragments
  3.3 Obtaining a platform model
  3.4 Examples
  3.5 Discussion and Further Work
4 Isolating the Reasons for the Performance of Parallel Machines on Numerical Programs
  4.1 Introduction
  4.2 Micro Measurements
  4.3 Measurements
  4.4 Algorithms
  4.5 Analysis of the Programs
  4.6 Conclusion
5 Targeting Transputer Systems, Past and Future
  5.1 Introduction
  5.2 The T800 family
  5.3 The T9000 family
  5.4 The Chameleon family
6 Adaptor: A Compilation System for Data Parallel Fortran Programs
  6.1 Introduction
  6.2 The Adaptor Compilation System
  6.3 Results of Benchmark Codes
  6.4 Results of Application Codes
  6.5 Summary
7 SNAP! Prototyping a Sequential and Numerical Application Parallelizer
  7.1 Introduction
  7.2 Compiler
  7.3 Conclusions
8 Knowledge-Based Automatic Parallelization by Pattern Recognition
  8.1 Introduction and Overview
  8.2 Preprocessing the Source Code
  8.3 Which Patterns are Supported?
  8.4 Pattern Recognition: A Detailed View
  8.5 A Parallel Algorithm for each Pattern
  8.6 Alignment and Partitioning
  8.7 Determining Cost Functions: Estimating and Benchmarking
  8.8 Implementation and Future Extensions
  8.9 Conclusions
9 Automatic Data Layout for Distributed-Memory Machines in the D Programming Environment
  9.1 Introduction
  9.2 Compilation system
  9.3 Dynamic Data Layout: Two Examples
  9.4 Towards Dynamic Data Layout
  9.5 Related Work
  9.6 Summary and Future Work
10 Subspace Optimizations
  10.1 Introduction
  10.2 Subspaces
  10.3 Subspace Changes
  10.4 Subspace Optimizations
  10.5 Subspaces Optimization Compared to Alignment
  10.6 Summary
  10.7 Acknowledgments
11 Data and Process Alignment in Modula-2*
  11.1 Introduction
  11.2 Modula-2*
  11.3 Alignment in Modula-2*
  11.4 Arrangement Graphs and Conflicts
  11.5 Cost Considerations
  11.6 Example
  11.7 Conclusion
12 Automatic Parallelization for Distributed Memory Multiprocessors
  12.1 Introduction
  12.2 Related Work
  12.3 Overview
  12.4 Parallelization Strategy
  12.5 Branch-and-Bound Algorithm
  12.6 Performance Estimator
  12.7 Prototype Implementation and Results
  12.8 Conclusions and Further Research
  12.9 Acknowledgements
A Trademarks