Albert Y. Zomaya, Hassan B. Diab
Dependable Computing Systems
Paradigms, Performance Issues, and Applications
- Hardcover
This edited volume is a repository of case studies, authored by experts and well-reputed researchers in the field. It deals with a variety of difficult problems in creating dependable computing systems.
A team of recognized experts leads the way to dependable computing systems
With computers and networks pervading every aspect of daily life, there is an ever-growing demand for dependability. In this unique resource, researchers and organizations will find the tools needed to identify and engage state-of-the-art approaches used for the specification, design, and assessment of dependable computer systems.
The first part of the book addresses models and paradigms of dependable computing, and the second part deals with enabling technologies and applications. Tough issues in creating dependable computing systems are also tackled, including:
- Verification techniques
- Model-based evaluation
- Adjudication and data fusion
- Robust communications primitives
- Fault tolerance
- Middleware
- Grid security
- Dependability in IBM mainframes
- Embedded software
- Real-time systems
Each chapter of this contributed work has been authored by a recognized expert. This is an excellent textbook for graduate and advanced undergraduate students in electrical engineering, computer engineering, and computer science, as well as a must-have reference that will help engineers, programmers, and technologists develop systems that are secure and reliable.
Note: This item can only be shipped to a delivery address in Germany.
Product details
- Wiley Series on Parallel and Distributed Computing
- Publisher: Wiley & Sons
- Publisher's item no.: 14667422000
- 1st edition
- Number of pages: 688
- Publication date: September 1, 2005
- Language: English
- Dimensions: 242 mm x 162 mm x 37 mm
- Weight: 1055 g
- ISBN-13: 9780471674221
- ISBN-10: 0471674222
- Item no.: 12987284
HASSAN B. DIAB, PhD, is Professor of Electrical and Computer Engineering, Faculty of Engineering and Architecture, American University of Beirut (AUB). He is currently Dean of the School of Engineering at AUB and Acting President of Dhofar University, Sultanate of Oman. He is the Associate Editor of Simulation: Transactions of the Society for Modeling and Simulation International and a founding member of the Arab Computer Society.
ALBERT Y. ZOMAYA, PhD, is the CISCO Systems Chair Professor of Internetworking, School of Information Technologies, The University of Sydney, and Deputy Director for Information Technology of the Sydney University Biological Informatics and Technology Centre. Dr. Zomaya has been the chair of the IEEE Technical Committee on Parallel Processing and has been awarded the IEEE Computer Society's Meritorious Service Award.
Preface xxiii
Contributors xxxv
Acknowledgments xxxix
Part I Models and Paradigms 1
1. Formal Verification Techniques for Digital Systems 3
Masahiro Fujita, Satoshi Komatsu, and Hiroshi Saito
1.1 Introduction 3
1.2 Basic Techniques for Formal Verification 4
1.3 Verification Techniques for Combinational Circuit Equivalence 7
1.4 Verification Techniques for Sequential Circuits 14
1.5 Summary 24
References 24
2. Tolerating Arbitrary Failures With State Machine Replication 27
Assia Doudou, Benoît Garbinato, and Rachid Guerraoui
2.1 Introduction 27
2.2 System Model 31
2.3 Total Order Broadcast 32
2.4 Weak Interactive Consistency 36
2.5 Muteness Failure Detector 44
2.6 Concluding Remarks 52
References 55
3. Model-Based Evaluation as a Support to the Design of Dependable Systems 57
Andrea Bondavalli, Silvano Chiaradonna, and Felicita di Giandomenico
3.1 Introduction 57
3.2 The Role of Model-Based Evaluation in the Development of Dependable Systems 58
3.3 Dependability Modeling Methodologies and Tools 61
3.4 Analytical Modeling to Support Design Decisions 68
3.5 Analytical Modeling to Support Fault Removal During Operational Life 76
3.6 Summary 82
References 82
4. Voting: A Paradigm for Adjudication and Data Fusion in Dependable Systems 87
Behrooz Parhami
4.1 Introduction 87
4.2 Voting in Dependable Systems 88
4.3 Voting Schemes and Problems 94
4.4 Voting for Data Fusion 98
4.5 Implementation Issues 102
4.6 Unifying Concepts 107
4.7 Conclusion 110
References 111
5. Robust Communication Primitives for Wireless Sensor Networks 115
Amol Bakshi and Viktor K. Prasanna
5.1 Introduction 115
5.2 Defining Realistic Models 117
5.3 Our System Model 119
5.4 Permutation Routing in a Single-hop Topology: State-of-the-Art 121
5.5 An Energy-Efficient Protocol Using a Low-Power Control Channel 125
5.6 Our Routing Protocol for a Faulty Network 132
5.7 Our Generalized Protocol for a Multichannel Network 135
5.8 Concluding Remarks 140
References 140
6. System-Level Diagnosis and Implications in Current Context 143
Arun K. Somani
6.1 Issues in Large and Complex Computing Systems 143
6.2 System-Level Diagnosis 145
6.3 Classification of Diagnosable Systems 148
6.4 Diagnosability Algorithms 157
6.5 Diagnosis Algorithms 160
6.6 Application of System-Level Diagnosis Algorithm 165
6.7 Summary and Conclusions 166
References 167
7. Predicate Detection in Asynchronous Systems With Crash Failures 171
Felix C. Gärtner and Stefan Pleisch
7.1 Introduction 171
7.2 Predicate Detection in Fault-Free Environments 173
7.3 Failures and Failure Detection 177
7.4 Predicate Detection in Faulty Environments 183
7.5 Solving Predicate Detection in Faulty Environments 194
7.6 Conclusion 209
References 211
8. Fault Tolerance Against Design Faults 213
Lorenzo Strigini
8.1 Introduction 213
8.2 Examples and Principles 215
8.3 Potential and Actual Benefits 225
8.4 Design Solutions 230
8.5 Summary 236
References 238
9. Formal Methods for Safety Critical Systems 243
Ali E. Abdallah, Jonathan P. Bowen, and Nimal Nissanke
9.1 Introduction 243
9.2 Specification of Safety 245
9.3 Historical Background 247
9.4 Safety 248
9.5 Application Areas 253
9.6 Specification Framework 256
9.7 System State and Behavior 262
9.8 Discussion 265
9.9 Conclusion 268
References 269
Part II Enabling Technologies and Applications 273
10. Dependability Support in Wireless Sensor Networks 275
Denis Gracanin, Mohamed Eltoweissy, Stephan Olariu, and Ashraf Wadaa
10.1 Motivation and Background 276
10.2 Service Centric Model 279
10.3 Conclusion 283
References 283
11. Availability Modeling in Practice 285
Kishor S. Trivedi, Archana Sathaye, and Srinivasan Ramani
11.1 Introduction 285
11.2 Modeling Approaches 286
11.3 Composite Availability and Performance Model 292
11.4 Digital Equipment Corporation Case Study 297
11.5 Conclusion 315
References 315
12. Experimental Dependability Evaluation 319
João Gabriel Silva and Henrique Madeira
12.1 Field Measurement 321
12.2 Fault Injection 323
12.3 Robustness Testing 337
12.4 Recent Developments: Dependability Benchmarking 340
12.5 Conclusion 342
References 343
13. A Dependable Architecture for Telemedicine in Support of Disaster Relief 349
Stephan Olariu, Kurt Maly, Edwin C. Foudriat, Sameh M. Yamany, and Thomas Luckenbach
13.1 Introduction 349
13.2 Telemedicine: State of the Art 350
13.3 The WIRM System Architecture 352
13.4 A Novel 3D Data Compression Technique 356
13.5 Interactive Remote Visualization 358
13.6 An Overview of H3M: Our Wireless Architecture 359
13.7 Concluding Remarks 366
References 366
14. An Overview of IBM Mainframe Dependable Computing: From System/360 to zSeries 369
Lisa Spainhower
14.1 Introduction 369
14.2 Error Detection and Fault Isolation 375
14.3 Instruction Level Retry 380
14.4 Online Repair 386
14.5 Summary 391
References 392
15. Tracking the Propagation of Data Errors in Software 395
Martin Hiller, Arshad Jhumka, and Neeraj Suri
15.1 Introduction 395
15.2 Target System Model 396
15.3 Overview of the Tool Suite 397
15.4 Setup: Experiment Design and Target Instrumentation 401
15.5 Injection: Running Experiments 407
15.6 Analysis: Obtaining Error Propagation Characteristics 408
15.7 Example Results Generated by Propane 409
15.8 Propane's Attributes and Main Characteristics 414
15.9 Summary 415
References 416
16. Integrated Reliable Real-Time Systems 419
Mohamed Younis
16.1 Background 421
16.2 Integration Issues 425
16.3 Few Forward Steps 429
16.4 An Example Aerospace Application 432
16.5 Conclusion 442
References 443
17. Network Resilience by Emergent Behavior from Simple Autonomous Agents 449
Bjarne E. Helvik and Otto Wittner
17.1 Introduction 449
17.2 Network Resilience 450
17.3 Handling Routing and Resources in Networks by Emergence 457
17.4 Cross-Entropy Based Path Finding 460
17.5 Finding "Best-Effort" Primary/Backup Paths 468
17.6 Discussion 473
17.7 Concluding Remarks 475
References 475
18. Safeguarding Critical Infrastructures 479
David Gamez, Simin Nadjm-Tehrani, John Bigham, Claudio Balducelli, Kalle Burbeck, and Tobias Chyssler
18.1 Introduction 479
18.2 Attacks, Failures, and Accidents 480
18.3 Solutions 483
18.4 The Safeguard Architecture 486
18.5 Future Work 497
18.6 Conclusion 497
References 498
19. Impact of Traffic Self-Similarity on the Performance of Routing Algorithms in Multicomputer Systems 501
Geyong Min, Mohamed Ould-Khaoua, Demetres D. Kouvatsos, and Irfan U. Awan
19.1 Introduction 502
19.2 The k-ary n-Cube and Dimension-Ordered Routing 504
19.3 Modeling of Traffic Self-Similarity 506
19.4 The Analytical Model 507
19.5 Impact of Self-Similar Traffic on Routing Performance 518
19.6 Conclusions 519
References 520
Appendix 19.1: Notation 523
20. Some Observations on Adaptive Meta-Heuristics for Routing in Datagram Networks 525
Albert Y. Zomaya, Tysun Chan, and Miro Kraetzl
20.1 Introduction 525
20.2 The Routing Problem 526
20.3 Genetic Algorithms and Routing 532
20.4 Genetic Routing Protocol Design 536
20.5 Genetic Routing Protocol Implementation 547
20.6 Results and Analysis 552
20.7 Conclusions 560
References 561
21. Reconfigurable Computing for Cryptography 563
Hassan B. Diab
21.1 Introduction 564
21.2 Reconfigurable Computing 565
21.3 AES Cryptography 576
21.4 Case Study: The Twofish Cipher on a Dynamic RC System 579
21.5 Future of RC 589
21.6 Conclusion 590
References 591
22. Dependability of Reconfigurable Computing 597
Mohamed Younis, I-Hong Yeh, Nicholas Kyriakopoulos, Nikitas Alexandridis, and Tarek El-Ghazawi
22.1 FPGA Preliminaries 598
22.2 FPGA Fault Taxonomy 603
22.3 Handling FPGA Failures 608
22.4 Conclusion and Open Issues 621
References 622
Index 627
Contributors xxxv
Acknowledgments xxxix
Part I Models and Paradigms 1
1. Formal Verification Techniques for Digital Systems 3
Masahiro Fujita, Satoshi Komatsu, and Hiroshi Saito
1.1 Introduction 3
1.2 Basic Techniques for Formal Verification 4
1.3 Verification Techniques for Combinational Circuit Equivalence 7
1.4 Verification Techniques for Sequential Circuits 14
1.5 Summary 24
References 24
2. Tolerating Arbitrary Failures With State Machine Replication 27
Assia Doudou, Benoît Garbinato, and Rachid Guerraoui
2.1 Introduction 27
2.2 System Model 31
2.3 Total Order Broadcast 32
2.4 Weak Interactive Consistency 36
2.5 Muteness Failure Detector 44
2.6 Concluding Remarks 52
References 55
3. Model-Based Evaluation as a Support to the Design of Dependable Systems
57
Andrea Bondavalli, Silvano Chiaradonna, and Felicita di Giandomenico
3.1 Introduction 57
3.2 The Role of Model-Based Evaluation in the Development of Dependable
Systems 58
3.3 Dependability Modeling Methodologies and Tools 61
3.4 Analytical Modeling to Support Design Decisions 68
3.5 Analytical Modeling to Support Fault Removal During Operational Life 76
3.6 Summary 82
References 82
4. Voting: A Paradigm for Adjudication and Data Fusion in Dependable
Systems 87
Behrooz Parhami
4.1 Introduction 87
4.2 Voting in Dependable Systems 88
4.3 Voting Schemes and Problems 94
4.4 Voting for Data Fusion 98
4.5 Implementation Issues 102
4.6 Unifying Concepts 107
4.7 Conclusion 110
References 111
5. Robust Communication Primitives for Wireless Sensor Networks 115
Amol Bakshi and Viktor K. Prasanna
5.1 Introduction 115
5.2 Defining Realistic Models 117
5.3 Our System Model 119
5.4 Permutation Routing in a Single-hop Topology: State-of-the-Art 121
5.5 An Energy-Efficient Protocol Using a Low-Power Control Channel 125
5.6 Our Routing Protocol for a Faulty Network 132
5.7 Our Generalized Protocol for a Multichannel Network 135
5.8 Concluding Remarks 140
References 140
6. System-Level Diagnosis and Implications in Current Context 143
Arun K. Somani
6.1 Issues in Large and Complex Computing Systems 143
6.2 System-Level Diagnosis 145
6.3 Classification of Diagnosable Systems 148
6.4 Diagnosability Algorithms 157
6.5 Diagnosis Algorithms 160
6.6 Application of System-Level Diagnosis Algorithm 165
6.7 Summary and Conclusions 166
References 167
7. Predicate Detection in Asynchronous Systems With Crash Failures 171
Felix C. Gärtner and Stefan Pleisch
7.1 Introduction 171
7.2 Predicate Detection in Fault-Free Environments 173
7.3 Failures and Failure Detection 177
7.4 Predicate Detection in Faulty Environments 183
7.5 Solving Predicate Detection in Faulty Environments 194
7.6 Conclusion 209
References 211
8. Fault Tolerance Against Design Faults 213
Lorenzo Strigini
8.1 Introduction 213
8.2 Examples and Principles 215
8.3 Potential and Actual Benefits 225
8.4 Design Solutions 230
8.5 Summary 236
References 238
9. Formal Methods for Safety Critical Systems 243
Ali E. Abdallah, Jonathan P. Bowen, and Nimal Nissanke
9.1 Introduction 243
9.2 Specification of Safety 245
9.3 Historical Background 247
9.4 Safety 248
9.5 Application Areas 253
9.6 Specification Framework 256
9.7 System State and Behavior 262
9.8 Discussion 265
9.9 Conclusion 268
References 269
Part II Enabling Technologies and Applications 273
10. Dependability Support in Wireless Sensor Networks 275
Denis Gracanin, Mohamed Eltoweissy, Stephan Olariu, and Ashraf Wadaa
10.1 Motivation and Background 276
10.2 Service Centric Model 279
10.3 Conclusion 283
References 283
11. Availability Modeling in Practice 285
Kishor S. Trivedi, Archana Sathaye, and Srinivasan Ramani
11.1 Introduction 285
11.2 Modeling Approaches 286
11.3 Composite Availability and Performance Model 292
11.4 Digital Equipment Corporation Case Study 297
11.5 Conclusion 315
References 315
12. Experimental Dependability Evaluation 319
João Gabriel Silva and Henrique Madeira
12.1 Field Measurement 321
12.2 Fault Injection 323
12.3 Robustness Testing 337
12.4 Recent Developments: Dependability Benchmarking 340
12.5 Conclusion 342
References 343
13. A Dependable Architecture for Telemedicine in Support of Disaster
Relief 349
Stephan Olariu, Kurt Maly, Edwin C. Foudriat, Sameh M. Yamany, and Thomas
Luckenbach
13.1 Introduction 349
13.2 Telemedicine-State of the Art 350
13.3 The WIRM System Architecture 352
13.4 A Novel 3D Data Compression Technique 356
13.5 Interactive Remote Visualization 358
13.6 An Overview of H3M-Our Wireless Architecture 359
13.7 Concluding Remarks 366
References 366
14. An Overview of IBM Mainframe Dependable Computing: From System/360 to
Series 369
Lisa Spainhower
14.1 Introduction 369
14.2 Error Detection and Fault Isolation 375
14.3 Instruction Level Retry 380
14.4 Online Repair 386
14.5 Summary 391
References 392
15. Tracking the Propagation of Data Errors in Software 395
Martin Hiller, Arshad Jhumka, and Neeraj Suri
15.1 Introduction 395
15.2 Target System Model 396
15.3 Overview of the Tool Suite 397
15.4 Setup: Experiment Design and Target Instrumentation 401
15.5 Injection: Running Experiments 407
15.6 Analysis: Obtaining Error Propagation Characteristics 408
15.7 Example Results Generated by Propane 409
15.8 Propane's Attributes and Main Characteristics 414
15.9 Summary 415
References 416
16. Integrated Reliable Real-Time Systems 419
Mohamed Younis
16.1 Background 421
16.2 Integration Issues 425
16.3 Few Forward Steps 429
16.4 An Example Aerospace Application 432
16.5 Conclusion 442
References 443
17. Network Resilience by Emergent Behavior from Simple Autonomous Agents
449
Bjarne E. Helvik and Otto Wittner
17.1 Introduction 449
17.2 Network Resilience 450
17.3 Handling Routing and Resources in Networks by Emergence 457
17.4 Cross-Entropy Based Path Finding 460
17.5 Finding "Best-Effort" Primary/Backup Paths 468
17.6 Discussion 473
17.7 Concluding Remarks 475
References 475
18. Safeguarding Critical Infrastructures 479
David Gamez, Simin Nadjm-Tehrani, John Bigham, Claudio Balducelli, Kalle
Burbeck, and Tobias Chyssler
18.1 Introduction 479
18.2 Attacks, Failures, and Accidents 480
18.3 Solutions 483
18.4 The Safeguard Architecture 486
18.5 Future Work 497
18.6 Conclusion 497
References 498
19. Impact of Traffic Self-Similarity on the Performance of Routing
Algorithms in Multicomputer Systems 501
Geyong Min, Mohamed Ould-Khaoua, Demetres D. Kouvatsos, and Irfan U. Awan
19.1 Introduction 502
19.2 The k-ary n-Cube and Dimension-Ordered Routing 504
19.3 Modeling of Traffic Self-Similarity 506
19.4 The Analytical Model 507
19.5 Impact of Self-Similar Traffic on Routing Performance 518
19.6 Conclusions 519
References 520
Appendix 19.1: Notation 523
20. Some Observations on Adaptive Meta-Heuristics for Routing in Datagram
Networks 525
Albert Y. Zomaya, Tysun Chan, and Miro Kraetzl
20.1 Introduction 525
20.2 The Routing Problem 526
20.3 Genetic Algorithms and Routing 532
20.4 Genetic Routing Protocol Design 536
20.5 Genetic Routing Protocol Implementation 547
20.6 Results and Analysis 552
20.7 Conclusions 560
References 561
21. Reconfigurable Computing for Cryptography 563
Hassan B. Diab
21.1 Introduction 564
21.2 Reconfigurable Computing 565
21.3 AES Cryptography 576
21.4 Case Study: The Twofish Cipher on a Dynamic RC System 579
21.5 Future of RC 589
21.6 Conclusion 590
References 591
22. Dependability of Reconfigurable Computing 597
Mohamed Younis, I-Hong Yeh, Nicholas Kyriakopoulos, Nikitas Alexandridis,
and Tarek El-Ghazawi
22.1 FPGA Preliminaries 598
22.2 FPGA Fault Taxonomy 603
22.3 Handling FPGA Failures 608
22.4 Conclusion and Open Issues 621
References 622
Index 627
Preface xxiii
Contributors xxxv
Acknowledgments xxxix
Part I Models and Paradigms 1
1. Formal Verification Techniques for Digital Systems 3
Masahiro Fujita, Satoshi Komatsu, and Hiroshi Saito
1.1 Introduction 3
1.2 Basic Techniques for Formal Verification 4
1.3 Verification Techniques for Combinational Circuit Equivalence 7
1.4 Verification Techniques for Sequential Circuits 14
1.5 Summary 24
References 24
2. Tolerating Arbitrary Failures With State Machine Replication 27
Assia Doudou, Benoît Garbinato, and Rachid Guerraoui
2.1 Introduction 27
2.2 System Model 31
2.3 Total Order Broadcast 32
2.4 Weak Interactive Consistency 36
2.5 Muteness Failure Detector 44
2.6 Concluding Remarks 52
References 55
3. Model-Based Evaluation as a Support to the Design of Dependable Systems
57
Andrea Bondavalli, Silvano Chiaradonna, and Felicita di Giandomenico
3.1 Introduction 57
3.2 The Role of Model-Based Evaluation in the Development of Dependable
Systems 58
3.3 Dependability Modeling Methodologies and Tools 61
3.4 Analytical Modeling to Support Design Decisions 68
3.5 Analytical Modeling to Support Fault Removal During Operational Life 76
3.6 Summary 82
References 82
4. Voting: A Paradigm for Adjudication and Data Fusion in Dependable
Systems 87
Behrooz Parhami
4.1 Introduction 87
4.2 Voting in Dependable Systems 88
4.3 Voting Schemes and Problems 94
4.4 Voting for Data Fusion 98
4.5 Implementation Issues 102
4.6 Unifying Concepts 107
4.7 Conclusion 110
References 111
5. Robust Communication Primitives for Wireless Sensor Networks 115
Amol Bakshi and Viktor K. Prasanna
5.1 Introduction 115
5.2 Defining Realistic Models 117
5.3 Our System Model 119
5.4 Permutation Routing in a Single-hop Topology: State-of-the-Art 121
5.5 An Energy-Efficient Protocol Using a Low-Power Control Channel 125
5.6 Our Routing Protocol for a Faulty Network 132
5.7 Our Generalized Protocol for a Multichannel Network 135
5.8 Concluding Remarks 140
References 140
6. System-Level Diagnosis and Implications in Current Context 143
Arun K. Somani
6.1 Issues in Large and Complex Computing Systems 143
6.2 System-Level Diagnosis 145
6.3 Classification of Diagnosable Systems 148
6.4 Diagnosability Algorithms 157
6.5 Diagnosis Algorithms 160
6.6 Application of System-Level Diagnosis Algorithm 165
6.7 Summary and Conclusions 166
References 167
7. Predicate Detection in Asynchronous Systems With Crash Failures 171
Felix C. Gärtner and Stefan Pleisch
7.1 Introduction 171
7.2 Predicate Detection in Fault-Free Environments 173
7.3 Failures and Failure Detection 177
7.4 Predicate Detection in Faulty Environments 183
7.5 Solving Predicate Detection in Faulty Environments 194
7.6 Conclusion 209
References 211
8. Fault Tolerance Against Design Faults 213
Lorenzo Strigini
8.1 Introduction 213
8.2 Examples and Principles 215
8.3 Potential and Actual Benefits 225
8.4 Design Solutions 230
8.5 Summary 236
References 238
9. Formal Methods for Safety Critical Systems 243
Ali E. Abdallah, Jonathan P. Bowen, and Nimal Nissanke
9.1 Introduction 243
9.2 Specification of Safety 245
9.3 Historical Background 247
9.4 Safety 248
9.5 Application Areas 253
9.6 Specification Framework 256
9.7 System State and Behavior 262
9.8 Discussion 265
9.9 Conclusion 268
References 269
Part II Enabling Technologies and Applications 273
10. Dependability Support in Wireless Sensor Networks 275
Denis Gracanin, Mohamed Eltoweissy, Stephan Olariu, and Ashraf Wadaa
10.1 Motivation and Background 276
10.2 Service Centric Model 279
10.3 Conclusion 283
References 283
11. Availability Modeling in Practice 285
Kishor S. Trivedi, Archana Sathaye, and Srinivasan Ramani
11.1 Introduction 285
11.2 Modeling Approaches 286
11.3 Composite Availability and Performance Model 292
11.4 Digital Equipment Corporation Case Study 297
11.5 Conclusion 315
References 315
12. Experimental Dependability Evaluation 319
João Gabriel Silva and Henrique Madeira
12.1 Field Measurement 321
12.2 Fault Injection 323
12.3 Robustness Testing 337
12.4 Recent Developments: Dependability Benchmarking 340
12.5 Conclusion 342
References 343
13. A Dependable Architecture for Telemedicine in Support of Disaster
Relief 349
Stephan Olariu, Kurt Maly, Edwin C. Foudriat, Sameh M. Yamany, and Thomas
Luckenbach
13.1 Introduction 349
13.2 Telemedicine-State of the Art 350
13.3 The WIRM System Architecture 352
13.4 A Novel 3D Data Compression Technique 356
13.5 Interactive Remote Visualization 358
13.6 An Overview of H3M-Our Wireless Architecture 359
13.7 Concluding Remarks 366
References 366
14. An Overview of IBM Mainframe Dependable Computing: From System/360 to
Series 369
Lisa Spainhower
14.1 Introduction 369
14.2 Error Detection and Fault Isolation 375
14.3 Instruction Level Retry 380
14.4 Online Repair 386
14.5 Summary 391
References 392
15. Tracking the Propagation of Data Errors in Software 395
Martin Hiller, Arshad Jhumka, and Neeraj Suri
15.1 Introduction 395
15.2 Target System Model 396
15.3 Overview of the Tool Suite 397
15.4 Setup: Experiment Design and Target Instrumentation 401
15.5 Injection: Running Experiments 407
15.6 Analysis: Obtaining Error Propagation Characteristics 408
15.7 Example Results Generated by Propane 409
15.8 Propane's Attributes and Main Characteristics 414
15.9 Summary 415
References 416
16. Integrated Reliable Real-Time Systems 419
Mohamed Younis
16.1 Background 421
16.2 Integration Issues 425
16.3 Few Forward Steps 429
16.4 An Example Aerospace Application 432
16.5 Conclusion 442
References 443
17. Network Resilience by Emergent Behavior from Simple Autonomous Agents
449
Bjarne E. Helvik and Otto Wittner
17.1 Introduction 449
17.2 Network Resilience 450
17.3 Handling Routing and Resources in Networks by Emergence 457
17.4 Cross-Entropy Based Path Finding 460
17.5 Finding "Best-Effort" Primary/Backup Paths 468
17.6 Discussion 473
17.7 Concluding Remarks 475
References 475
18. Safeguarding Critical Infrastructures 479
David Gamez, Simin Nadjm-Tehrani, John Bigham, Claudio Balducelli, Kalle
Burbeck, and Tobias Chyssler
18.1 Introduction 479
18.2 Attacks, Failures, and Accidents 480
18.3 Solutions 483
18.4 The Safeguard Architecture 486
18.5 Future Work 497
18.6 Conclusion 497
References 498
19. Impact of Traffic Self-Similarity on the Performance of Routing
Algorithms in Multicomputer Systems 501
Geyong Min, Mohamed Ould-Khaoua, Demetres D. Kouvatsos, and Irfan U. Awan
19.1 Introduction 502
19.2 The k-ary n-Cube and Dimension-Ordered Routing 504
19.3 Modeling of Traffic Self-Similarity 506
19.4 The Analytical Model 507
19.5 Impact of Self-Similar Traffic on Routing Performance 518
19.6 Conclusions 519
References 520
Appendix 19.1: Notation 523
20. Some Observations on Adaptive Meta-Heuristics for Routing in Datagram
Networks 525
Albert Y. Zomaya, Tysun Chan, and Miro Kraetzl
20.1 Introduction 525
20.2 The Routing Problem 526
20.3 Genetic Algorithms and Routing 532
20.4 Genetic Routing Protocol Design 536
20.5 Genetic Routing Protocol Implementation 547
20.6 Results and Analysis 552
20.7 Conclusions 560
References 561
21. Reconfigurable Computing for Cryptography 563
Hassan B. Diab
21.1 Introduction 564
21.2 Reconfigurable Computing 565
21.3 AES Cryptography 576
21.4 Case Study: The Twofish Cipher on a Dynamic RC System 579
21.5 Future of RC 589
21.6 Conclusion 590
References 591
22. Dependability of Reconfigurable Computing 597
Mohamed Younis, I-Hong Yeh, Nicholas Kyriakopoulos, Nikitas Alexandridis,
and Tarek El-Ghazawi
22.1 FPGA Preliminaries 598
22.2 FPGA Fault Taxonomy 603
22.3 Handling FPGA Failures 608
22.4 Conclusion and Open Issues 621
References 622
Index 627
Contributors xxxv
Acknowledgments xxxix
Part I Models and Paradigms 1
1. Formal Verification Techniques for Digital Systems 3
Masahiro Fujita, Satoshi Komatsu, and Hiroshi Saito
1.1 Introduction 3
1.2 Basic Techniques for Formal Verification 4
1.3 Verification Techniques for Combinational Circuit Equivalence 7
1.4 Verification Techniques for Sequential Circuits 14
1.5 Summary 24
References 24
2. Tolerating Arbitrary Failures With State Machine Replication 27
Assia Doudou, Benoît Garbinato, and Rachid Guerraoui
2.1 Introduction 27
2.2 System Model 31
2.3 Total Order Broadcast 32
2.4 Weak Interactive Consistency 36
2.5 Muteness Failure Detector 44
2.6 Concluding Remarks 52
References 55
3. Model-Based Evaluation as a Support to the Design of Dependable Systems
57
Andrea Bondavalli, Silvano Chiaradonna, and Felicita di Giandomenico
3.1 Introduction 57
3.2 The Role of Model-Based Evaluation in the Development of Dependable
Systems 58
3.3 Dependability Modeling Methodologies and Tools 61
3.4 Analytical Modeling to Support Design Decisions 68
3.5 Analytical Modeling to Support Fault Removal During Operational Life 76
3.6 Summary 82
References 82
4. Voting: A Paradigm for Adjudication and Data Fusion in Dependable
Systems 87
Behrooz Parhami
4.1 Introduction 87
4.2 Voting in Dependable Systems 88
4.3 Voting Schemes and Problems 94
4.4 Voting for Data Fusion 98
4.5 Implementation Issues 102
4.6 Unifying Concepts 107
4.7 Conclusion 110
References 111
5. Robust Communication Primitives for Wireless Sensor Networks 115
Amol Bakshi and Viktor K. Prasanna
5.1 Introduction 115
5.2 Defining Realistic Models 117
5.3 Our System Model 119
5.4 Permutation Routing in a Single-hop Topology: State-of-the-Art 121
5.5 An Energy-Efficient Protocol Using a Low-Power Control Channel 125
5.6 Our Routing Protocol for a Faulty Network 132
5.7 Our Generalized Protocol for a Multichannel Network 135
5.8 Concluding Remarks 140
References 140
6. System-Level Diagnosis and Implications in Current Context 143
Arun K. Somani
6.1 Issues in Large and Complex Computing Systems 143
6.2 System-Level Diagnosis 145
6.3 Classification of Diagnosable Systems 148
6.4 Diagnosability Algorithms 157
6.5 Diagnosis Algorithms 160
6.6 Application of System-Level Diagnosis Algorithm 165
6.7 Summary and Conclusions 166
References 167
7. Predicate Detection in Asynchronous Systems With Crash Failures 171
Felix C. Gärtner and Stefan Pleisch
7.1 Introduction 171
7.2 Predicate Detection in Fault-Free Environments 173
7.3 Failures and Failure Detection 177
7.4 Predicate Detection in Faulty Environments 183
7.5 Solving Predicate Detection in Faulty Environments 194
7.6 Conclusion 209
References 211
8. Fault Tolerance Against Design Faults 213
Lorenzo Strigini
8.1 Introduction 213
8.2 Examples and Principles 215
8.3 Potential and Actual Benefits 225
8.4 Design Solutions 230
8.5 Summary 236
References 238
9. Formal Methods for Safety Critical Systems 243
Ali E. Abdallah, Jonathan P. Bowen, and Nimal Nissanke
9.1 Introduction 243
9.2 Specification of Safety 245
9.3 Historical Background 247
9.4 Safety 248
9.5 Application Areas 253
9.6 Specification Framework 256
9.7 System State and Behavior 262
9.8 Discussion 265
9.9 Conclusion 268
References 269
Part II Enabling Technologies and Applications 273
10. Dependability Support in Wireless Sensor Networks 275
Denis Gracanin, Mohamed Eltoweissy, Stephan Olariu, and Ashraf Wadaa
10.1 Motivation and Background 276
10.2 Service Centric Model 279
10.3 Conclusion 283
References 283
11. Availability Modeling in Practice 285
Kishor S. Trivedi, Archana Sathaye, and Srinivasan Ramani
11.1 Introduction 285
11.2 Modeling Approaches 286
11.3 Composite Availability and Performance Model 292
11.4 Digital Equipment Corporation Case Study 297
11.5 Conclusion 315
References 315
12. Experimental Dependability Evaluation 319
João Gabriel Silva and Henrique Madeira
12.1 Field Measurement 321
12.2 Fault Injection 323
12.3 Robustness Testing 337
12.4 Recent Developments: Dependability Benchmarking 340
12.5 Conclusion 342
References 343
13. A Dependable Architecture for Telemedicine in Support of Disaster Relief 349
Stephan Olariu, Kurt Maly, Edwin C. Foudriat, Sameh M. Yamany, and Thomas Luckenbach
13.1 Introduction 349
13.2 Telemedicine: State of the Art 350
13.3 The WIRM System Architecture 352
13.4 A Novel 3D Data Compression Technique 356
13.5 Interactive Remote Visualization 358
13.6 An Overview of H3M: Our Wireless Architecture 359
13.7 Concluding Remarks 366
References 366
14. An Overview of IBM Mainframe Dependable Computing: From System/360 to zSeries 369
Lisa Spainhower
14.1 Introduction 369
14.2 Error Detection and Fault Isolation 375
14.3 Instruction Level Retry 380
14.4 Online Repair 386
14.5 Summary 391
References 392
15. Tracking the Propagation of Data Errors in Software 395
Martin Hiller, Arshad Jhumka, and Neeraj Suri
15.1 Introduction 395
15.2 Target System Model 396
15.3 Overview of the Tool Suite 397
15.4 Setup: Experiment Design and Target Instrumentation 401
15.5 Injection: Running Experiments 407
15.6 Analysis: Obtaining Error Propagation Characteristics 408
15.7 Example Results Generated by Propane 409
15.8 Propane's Attributes and Main Characteristics 414
15.9 Summary 415
References 416
16. Integrated Reliable Real-Time Systems 419
Mohamed Younis
16.1 Background 421
16.2 Integration Issues 425
16.3 Few Forward Steps 429
16.4 An Example Aerospace Application 432
16.5 Conclusion 442
References 443
17. Network Resilience by Emergent Behavior from Simple Autonomous Agents 449
Bjarne E. Helvik and Otto Wittner
17.1 Introduction 449
17.2 Network Resilience 450
17.3 Handling Routing and Resources in Networks by Emergence 457
17.4 Cross-Entropy Based Path Finding 460
17.5 Finding "Best-Effort" Primary/Backup Paths 468
17.6 Discussion 473
17.7 Concluding Remarks 475
References 475
18. Safeguarding Critical Infrastructures 479
David Gamez, Simin Nadjm-Tehrani, John Bigham, Claudio Balducelli, Kalle Burbeck, and Tobias Chyssler
18.1 Introduction 479
18.2 Attacks, Failures, and Accidents 480
18.3 Solutions 483
18.4 The Safeguard Architecture 486
18.5 Future Work 497
18.6 Conclusion 497
References 498
19. Impact of Traffic Self-Similarity on the Performance of Routing Algorithms in Multicomputer Systems 501
Geyong Min, Mohamed Ould-Khaoua, Demetres D. Kouvatsos, and Irfan U. Awan
19.1 Introduction 502
19.2 The k-ary n-Cube and Dimension-Ordered Routing 504
19.3 Modeling of Traffic Self-Similarity 506
19.4 The Analytical Model 507
19.5 Impact of Self-Similar Traffic on Routing Performance 518
19.6 Conclusions 519
References 520
Appendix 19.1: Notation 523
20. Some Observations on Adaptive Meta-Heuristics for Routing in Datagram Networks 525
Albert Y. Zomaya, Tysun Chan, and Miro Kraetzl
20.1 Introduction 525
20.2 The Routing Problem 526
20.3 Genetic Algorithms and Routing 532
20.4 Genetic Routing Protocol Design 536
20.5 Genetic Routing Protocol Implementation 547
20.6 Results and Analysis 552
20.7 Conclusions 560
References 561
21. Reconfigurable Computing for Cryptography 563
Hassan B. Diab
21.1 Introduction 564
21.2 Reconfigurable Computing 565
21.3 AES Cryptography 576
21.4 Case Study: The Twofish Cipher on a Dynamic RC System 579
21.5 Future of RC 589
21.6 Conclusion 590
References 591
22. Dependability of Reconfigurable Computing 597
Mohamed Younis, I-Hong Yeh, Nicholas Kyriakopoulos, Nikitas Alexandridis, and Tarek El-Ghazawi
22.1 FPGA Preliminaries 598
22.2 FPGA Fault Taxonomy 603
22.3 Handling FPGA Failures 608
22.4 Conclusion and Open Issues 621
References 622
Index 627