Zachary Taylor, Subramanyam Ranganathan
Designing High Availability Systems
DFSS and Classical Reliability Techniques with Practical Real Life Examples
- Hardcover
A practical, step-by-step guide to designing world-class, high availability systems using both classical and DFSS reliability techniques
Whether designing telecom, aerospace, automotive, medical, financial, or public safety systems, every engineer aims for the utmost reliability and availability in the systems he or she designs. But between the dream of world-class performance and reality falls the shadow of complexities that can bedevil even the most rigorous design process. While there is an array of robust predictive engineering tools, there has been no single-source guide to understanding and using them . . . until now.
Offering a case-based approach to designing, predicting, and deploying world-class high-availability systems from the ground up, this book brings together the best classical and DFSS reliability techniques. Although it focuses on technical aspects, this guide considers the business and market constraints that require that systems be designed right the first time.
Written in plain English and following a step-by-step "cookbook" format, Designing High Availability Systems:
- Shows how to integrate an array of design/analysis tools, including Six Sigma, Failure Analysis, and Reliability Analysis
- Features many real-life examples and case studies describing predictive design methods, tradeoffs, risk priorities, "what-if" scenarios, and more
- Delivers numerous high-impact takeaways that you can apply to your current projects immediately
- Provides access to MATLAB programs for simulating problem sets presented, along with PowerPoint slides to assist in outlining the problem-solving process
Designing High Availability Systems is an indispensable working resource for system engineers, software/hardware architects, and project teams working in all industries.
Note: This item can only be shipped to a delivery address in Germany.
Product details
- Publisher: Wiley & Sons
- 1st edition
- Pages: 480
- Publication date: November 5, 2013
- Language: English
- Dimensions: 240 mm x 161 mm x 30 mm
- Weight: 883 g
- ISBN-13: 9781118551127
- ISBN-10: 1118551125
- Item no.: 36879740
Preface
List of Abbreviations
1. Introduction
2. Initial Considerations for Reliability Design
  2.1 The Challenge
  2.2 Initial Data Collection
  2.3 Where Do We Get MTBF Information?
  2.4 MTTR and Identifying Failures
  2.5 Summary
3. A Game of Dice: An Introduction to Probability
  3.1 Introduction
  3.2 A Game of Dice
  3.3 Mutually Exclusive and Independent Events
  3.4 Dice Paradox Problem and Conditional Probability
  3.5 Flip a Coin
  3.6 Dice Paradox Revisited
  3.7 Probabilities for Multiple Dice Throws
  3.8 Conditional Probability Revisited
  3.9 Summary
4. Discrete Random Variables
  4.1 Introduction
  4.2 Random Variables
  4.3 Discrete Probability Distributions
  4.4 Bernoulli Distribution
  4.5 Geometric Distribution
  4.6 Binomial Coefficients
  4.7 Binomial Distribution
  4.8 Poisson Distribution
  4.9 Negative Binomial Random Variable
  4.10 Summary
5. Continuous Random Variables
  5.1 Introduction
  5.2 Uniform Random Variables
  5.3 Exponential Random Variables
  5.4 Weibull Random Variables
  5.5 Gamma Random Variables
  5.6 Chi-Square Random Variables
  5.7 Normal Random Variables
  5.8 Relationship between Random Variables
  5.9 Summary
6. Random Processes
  6.1 Introduction
  6.2 Markov Process
  6.3 Poisson Process
  6.4 Deriving the Poisson Distribution
  6.5 Poisson Interarrival Times
  6.6 Summary
7. Modeling and Reliability Basics
  7.1 Introduction
  7.2 Modeling
  7.3 Failure Probability and Failure Density
  7.4 Unreliability, F(t)
  7.5 Reliability, R(t)
  7.6 MTTF
  7.7 MTBF
  7.8 Repairable System
  7.9 Nonrepairable System
  7.10 MTTR
  7.11 Failure Rate
  7.12 Maintainability
  7.13 Operability
  7.14 Availability
  7.15 Unavailability
  7.16 Five 9s Availability
  7.17 Downtime
  7.18 Constant Failure Rate Model
  7.19 Conditional Failure Rate
  7.20 Bayes's Theorem
  7.21 Reliability Block Diagrams
  7.22 Summary
8. Discrete-Time Markov Analysis
  8.1 Introduction
  8.2 Markov Process Defined
  8.3 Dynamic Modeling
  8.4 Discrete Time Markov Chains
  8.5 Absorbing Markov Chains
  8.6 Nonrepairable Reliability Models
  8.7 Summary
9. Continuous-Time Markov Systems
  9.1 Introduction
  9.2 Continuous-Time Markov Processes
  9.3 Two-State Derivation
  9.4 Steps to Create a Markov Reliability Model
  9.5 Asymptotic Behavior (Steady-State Behavior)
  9.6 Limitations of Markov Modeling
  9.7 Markov Reward Models
  9.8 Summary
10. Markov Analysis: Nonrepairable Systems
  10.1 Introduction
  10.2 One Component, No Repair
  10.3 Nonrepairable Systems: Parallel System with No Repair
  10.4 Series System with No Repair: Two Identical Components
  10.5 Parallel System with Partial Repair: Identical Components
  10.6 Parallel System with No Repair: Nonidentical Components
  10.7 Summary
11. Markov Analysis: Repairable Systems
  11.1 Repairable Systems
  11.2 One Component with Repair
  11.3 Parallel System with Repair: Identical Component Failure and Repair Rates
  11.4 Parallel System with Repair: Different Failure and Repair Rates
  11.5 Summary
12. Analyzing Confidence Levels
  12.1 Introduction
  12.2 pdf of a Squared Normal Random Variable
  12.3 pdf of the Sum of Two Random Variables
  12.4 pdf of the Sum of Two Gamma Random Variables
  12.5 pdf of the Sum of n Gamma Random Variables
  12.6 Goodness-of-Fit Test Using Chi-Square
  12.7 Confidence Levels
  12.8 Summary
13. Estimating Reliability Parameters
  13.1 Introduction
  13.2 Bayes' Estimation
  13.3 Example of Estimating Hardware MTBF
  13.4 Estimating Software MTBF
  13.5 Revising Initial MTBF Estimates and Tradeoffs
  13.6 Summary
14. Six Sigma Tools for Predictive Engineering
  14.1 Introduction
  14.2 Gathering Voice of Customer (VOC)
  14.3 Processing Voice of Customer
  14.4 Kano Analysis
  14.5 Analysis of Technical Risks
  14.6 Quality Function Deployment (QFD) or House of Quality
  14.7 Program Level Transparency of Critical Parameters
  14.8 Mapping DFSS Techniques to Critical Parameters
  14.9 Critical Parameter Management (CPM)
  14.10 First Principles Modeling
  14.11 Design of Experiments (DOE)
  14.12 Design Failure Modes and Effects Analysis (DFMEA)
  14.13 Fault Tree Analysis
  14.14 Pugh Matrix
  14.15 Monte Carlo Simulation
  14.16 Commercial DFSS Tools
  14.17 Mathematical Prediction of System Capability instead of "Gut Feel"
  14.18 Visualizing System Behavior Early in the Life Cycle
  14.19 Critical Parameter Scorecard
  14.20 Applying DFSS in Third-Party Intensive Programs
  14.21 Summary
15. Design Failure Modes and Effects Analysis
  15.1 Introduction
  15.2 What Is Design Failure Modes and Effects Analysis (DFMEA)?
  15.3 Definitions
  15.4 Business Case for DFMEA
  15.5 Why Conduct DFMEA?
  15.6 When to Perform DFMEA
  15.7 Applicability of DFMEA
  15.8 DFMEA Template
  15.9 DFMEA Life Cycle
  15.10 The DFMEA Team
  15.11 DFMEA Advantages and Disadvantages
  15.12 Limitations of DFMEA
  15.13 DFMEAs, FTAs, and Reliability Analysis
  15.14 Summary
16. Fault Tree Analysis
  16.1 What Is Fault Tree Analysis?
  16.2 Events
  16.3 Logic Gates
  16.4 Creating a Fault Tree
  16.5 Fault Tree Limitations
  16.6 Summary
17. Monte Carlo Simulation Models
  17.1 Introduction
  17.2 System Behavior over Mission Time
  17.3 Reliability Parameter Analysis
  17.4 A Worked Example
  17.5 Component and System Failure Times Using Monte Carlo Simulations
  17.6 Limitations of Using Nontime-Based Monte Carlo Simulations
  17.7 Summary
18. Updating Reliability Estimates: Case Study
  18.1 Introduction
  18.2 Overview of the Base Station Controller--Data Only (BSC-DO) System
  18.3 Downtime Calculation
  18.4 Calculating Availability from Field Data Only
  18.5 Assumptions Behind Using the Chi-Square Methodology
  18.6 Fault Tree Updates from Field Data
  18.7 Summary
19. Fault Management Architectures
  19.1 Introduction
  19.2 Faults, Errors, and Failures
  19.3 Fault Management Design
  19.4 Repair versus Recovery
  19.5 Design Considerations for Reliability Modeling
  19.6 Architecture Techniques to Improve Availability
  19.7 Redundancy Schemes
  19.8 Summary
20. Application of DFMEA to Real-Life Example
  20.1 Introduction
  20.2 Cage Failover Architecture Description
  20.3 Cage Failover DFMEA Example
  20.4 DFMEA Scorecard
  20.5 Lessons Learned
  20.6 Summary
21. Application of FTA to Real-Life Example
  21.1 Introduction
  21.2 Calculating Availability Using Fault Tree Analysis
  21.3 Building the Basic Events
  21.4 Building the Fault Tree
  21.5 Steps for Creating and Estimating the Availability Using FTA
  21.6 Summary
22. Complex High Availability System Analysis
  22.1 Introduction
  22.2 Markov Analysis of the Hardware Components
  22.3 Building a Fault Tree from the Hardware Markov Model
  22.4 Markov Analysis of the Software Components
  22.5 Markov Analysis of the Combined Hardware and Software Components
  22.6 Techniques for Simplifying Markov Analysis
  22.7 Summary
References
Index