Steve Flanders
Mastering OpenTelemetry and Observability
Enhancing Application and Infrastructure Performance and Avoiding Outages
- Paperback
Other customers were also interested in
- Valliappa Lakshmanan, Google BigQuery: The Definitive Guide, 56,99 €
- Rajesh Francis, Amazon Redshift: The Definitive Guide, 70,99 €
- Ethan Cowan, Hands-On Differential Privacy, 67,99 €
- Cary Millsap, How to Make Things Faster, 48,99 €
- Mark Edmondson, Learning Google Analytics, 56,99 €
- Charity Majors, Observability Engineering, 51,99 €
- David Calavera, Linux Observability with BPF, 48,99 €
Discover how to deploy OpenTelemetry(TM) to eliminate vendor lock-in and regain control of your telemetry data.
In Mastering OpenTelemetry(TM) and Observability: Enhancing Application and Infrastructure Performance and Avoiding Outages, veteran software engineering leader Steve Flanders delivers a comprehensive guide to OpenTelemetry (OTel) and achieving observability. You'll learn how OTel revolutionizes observability by providing a suite of APIs, SDKs, and tools to capture and analyze telemetry data across your infrastructure, and how to leverage OTel to better achieve observability. Find out how to navigate the complexities of observability with ease. Discover how to deploy and configure OTel, even in challenging brownfield environments. With insights from a seasoned industry expert, gain a deep understanding of observability's rise and its impact on the modern software landscape. Explore real-world use cases, deployment models, and hands-on exercises. Unlock the full potential of this technology with practical guidance on configuration and optimization. Plus, learn how to contribute to the OTel community and shape the future of observability.
Whether you're new to observability or a seasoned professional, Mastering OpenTelemetry(TM) and Observability equips you with the knowledge and tools to predict and prevent downtime, outages, and failures in your enterprise infrastructure. Start your journey to observability mastery today with this can't-miss handbook for cloud architects, software developers, site reliability engineers, and DevOps and AIOps professionals.
Note: This item can only be delivered to a German shipping address.
Product details
- Series: Tech Today
- Publisher: John Wiley & Sons Inc
- Pages: 368
- Publication date: November 5, 2024
- Language: English
- Dimensions: 234 mm x 190 mm x 22 mm
- Weight: 670 g
- ISBN-13: 9781394253128
- ISBN-10: 1394253125
- Item no.: 70070590
- Manufacturer information
- Person responsible for product safety
- Europaallee 1
- 36244 Bad Hersfeld
- gpsr@libri.de
STEVE FLANDERS is a Senior Director of Engineering at Splunk, a Cisco company. Steve is one of the founding members of the OpenTelemetry project.
Foreword xiii
Introduction xiv
The Mastering Series xvi
Chapter 1 What Is Observability? 1
Definition 1
Background 4
Cloud Native Era 4
Monitoring Compared to Observability 5
Metadata 8
Dimensionality 9
Cardinality 9
Semantic Conventions 10
Data Sensitivity 10
Signals 10
Metrics 10
Logs 13
Traces 14
Other Signals 20
Collecting Signals 20
Instrumentation 21
Push Versus Pull Collection 22
Data Collection 23
Sampling Signals 26
Observability 27
Platforms 27
Application Performance Monitoring 28
The Bottom Line 28
Notes 30
Chapter 2 Introducing OpenTelemetry! 31
Background 31
Observability Pain Points 31
The Rise of Open Source Software 34
Introducing OpenTelemetry 35
OpenTelemetry Components 37
OpenTelemetry Concepts 48
Roadmap 50
The Bottom Line 50
Notes 51
Chapter 3 Getting Started with the Astronomy Shop 53
Background 53
Architecture 54
Prerequisites 54
Getting Started 55
Accessing the Astronomy Shop 57
Accessing Telemetry Data 57
Beyond the Basics 58
Configuring Load Generation 58
Configuring Feature Flags 59
Configuring Tests Built from Traces 60
Configuring the OTel Collector 60
Configuring OTel Instrumentation 62
Troubleshooting Astronomy Shop 62
Astronomy Shop Scenarios 63
Troubleshooting Errors 63
Troubleshooting Availability 69
Troubleshooting Performance 70
Troubleshooting Telemetry 74
The Bottom Line 75
Notes 76
Chapter 4 Understanding the OpenTelemetry Specification 77
Background 77
API Specification 79
API Definition 80
API Context 80
API Signals 81
API Implementation 82
SDK Specification 82
SDK Definition 83
SDK Signals 83
SDK Implementation 84
Data Specification 84
Data Models 86
Data Protocols 88
Data Semantic Conventions 88
Data Compatibility 89
General Specification 90
The Bottom Line 91
Notes 92
Chapter 5 Managing the OpenTelemetry Collector 93
Background 94
Deployment Modes 95
Agent Mode 96
Gateway Mode 98
Reference Architectures 100
The Basics 101
The Binary 103
Sizing 103
Components 104
Configuration 106
Receivers and Exporters 115
Processors 116
Extensions 126
Connectors 127
Observing 128
Relevant Metrics 128
Health Check Extension 131
zPages Extension 131
Troubleshooting 134
Out of Memory Crashes 134
Data Not Being Received or Exported 134
Performance Issues 135
Beyond the Basics 135
Distributions 135
Securing 137
Management 138
The Bottom Line 140
Notes 141
Chapter 6 Leveraging OpenTelemetry Instrumentation 143
Environment Setup 144
Python Trace Instrumentation 149
Automatic Instrumentation 150
Manual Instrumentation 157
Programmatic Instrumentation 163
Mixing Automatic and Manual Trace Instrumentation 166
Python Metrics Instrumentation 167
Automatic Instrumentation 168
Manual Instrumentation 169
Programmatic Instrumentation 174
Mixing Automatic and Manual Metric Instrumentation 176
Python Log Instrumentation 178
Manual Metadata Enrichment 179
Trace Correlation 181
Language Considerations 183
.NET 184
Java 184
Go 184
Node.js 185
Deployment Models 185
Distributions 185
The Bottom Line 186
Notes 187
Chapter 7 Adopting OpenTelemetry 189
The Basics 189
Why OTel and Why Now? 190
Where to Start? 191
General Process 192
Data Collection 193
Instrumentation 195
Production Readiness 196
Maturity Framework 197
Brownfield Deployment 198
Data Collection 198
Instrumentation 200
Dashboards and Alerts 202
Greenfield Deployment 204
Data Collection 204
Instrumentation 208
Other Considerations 208
Administration and Maintenance 208
Environments 211
Semantic Conventions 212
The Future 213
The Bottom Line 213
Notes 214
Chapter 8 The Power of Context and Correlation 215
Background 215
Context 217
OTel Context 219
Trace Context 221
Resource Context 223
Logic Context 224
Correlation 225
Time Correlation 225
Context Correlation 226
Trace Correlation 228
Metric Correlation 230
The Bottom Line 230
Notes 231
Chapter 9 Choosing an Observability Platform 233
Primary Considerations 233
Platform Capabilities 235
Marketing Versus Reality 237
Price, Cost, and Value 238
Observability Fragmentation 241
Primary Factors 242
Build, Buy, or Manage 242
Licensing, Operations, and Deployment 244
OTel Compatibility and Vendor Lock-In 244
Stakeholders and Company Culture 245
Implementation Basics 246
Administration 247
Usage 248
Maturity Framework 248
The Bottom Line 250
Notes 250
Chapter 10 Observability Antipatterns and Pitfalls 251
Telemetry Data Missteps 251
Mixing Instrumentation Libraries Scenario 253
Automatic Instrumentation Scenario 253
Custom Instrumentation Scenario 254
Component Configuration Scenario 255
Performance Overhead Scenario 255
Resource Allocation Scenario 256
Security Considerations Scenario 256
Monitoring and Maintenance Scenario 257
Observability Platform Missteps 258
Vendor Lock-in Scenario 260
Fragmented Tooling Scenario 260
Tool Fatigue Scenario 261
Inadequate Scalability Scenario 261
Data Overload Scenario 262
Company Culture Implications 264
Lack of Leadership Support Scenario 265
Resistance to Change Scenario 266
Collaboration and Alignment Scenario 266
Goals and Success Criteria Scenario 267
Standardization and Consistency Scenario 268
Incentives and Recognition Scenario 268
Feedback and Improvement Scenario 269
Prioritization Framework 270
The Bottom Line 272
Notes 273
Chapter 11 Observability at Scale 275
Understanding the Challenges 275
Volume and Velocity of Telemetry Data 276
Distributed System Complexity 278
Observability Platform Complexity 281
Infrastructure and Resource Constraints 281
Strategies for Scaling Observability 282
Elasticity, Elasticity, Elasticity! 282
Leverage Cloud Native Technologies 284
Filter, Sample, and Aggregate 286
Anomaly Detection and Predictive Analytics 290
Emerging Technologies and Methodologies 291
Best Practices for Managing Scale 292
General Recommendations 292
Instrumentation and Data Collection 293
Observability Platform 293
The Bottom Line 294
Notes 295
Chapter 12 The Future of Observability 297
Challenges and Opportunities 297
Cost 297
Complexity 299
Compliance 300
Code 301
Emerging Trends and Innovations 302
Artificial Intelligence 303
Observability as Code 304
Service Mesh 305
eBPF 306
The Future of OpenTelemetry 307
Stabilization and Expansion 308
Expanded Signal Support 308
Unified Query Language 310
Community-driven Innovation 310
The Bottom Line 311
Notes 311
Appendix A The Bottom Line 313
Chapter 1: What Is Observability? 313
Chapter 2: Introducing OpenTelemetry! 315
Chapter 3: Getting Started with the Astronomy Shop 316
Chapter 4: Understanding the OpenTelemetry Specification 317
Chapter 5: Managing the OpenTelemetry Collector 318
Chapter 6: Leveraging OpenTelemetry Instrumentation 320
Chapter 7: Adopting OpenTelemetry 321
Chapter 8: The Power of Context and Correlation 323
Chapter 9: Choosing an Observability Platform 324
Chapter 10: Observability Antipatterns and Pitfalls 326
Chapter 11: Observability at Scale 327
Chapter 12: The Future of Observability 328
Appendix B Introduction 329
Chapter 2: Introducing OpenTelemetry! 330
> Roadmap 330
Chapter 3: Getting Started with the Astronomy Shop 330
> Architecture 330
Chapter 5: Managing the OpenTelemetry Collector 332
Background 332
> Components 332
Chapter 12: The Future of Observability 340
> Code 340
Notes 341
Index 343