Mastering Site Reliability Engineering in Enterprise

A Complete Guide to Resilient Systems & Chaos Engineering

Florian Hoeppner, Francesco Sbaraglia

Paperback

Other customers were also interested in

Pieter-Jan Nefkens
Transforming Campus Networks to Intent-Based Networking

68,99 €
Nathan Muller
Wi-Fi for the Enterprise

53,99 €
Information Technology Convergence

329,99 €
Manoj Kuppam
Enterprise Digital Reliability

32,99 €
Seyed Javad Kazemitabar
Coping with Interference in Wireless Networks

75,99 €
Jake Switzer
MC Microsoft Certified Azure Data Fundamentals Study Guide with Online Labs: Exam Dp-900

118,99 €
Cornelius T. Leondes
Implementation Techniques

71,99 €

Product description

Implement site reliability engineering (SRE) practices in an enterprise IT environment and manage its complete lifecycle. This book is a comprehensive guide designed to help site reliability engineers, DevOps teams, and platform engineers identify, address, and mitigate system vulnerabilities before they escalate into significant issues.

The authors highlight the shift from IT as a cost centre to a core business function, emphasising the central role of developers and the need for speed and reliability. They detail the challenges of transitioning to SRE, including overcoming cultural resistance and legacy infrastructure limitations, and stress the importance of building resilience into systems and processes. Specific SRE capabilities such as chaos engineering, observability, and toil management are explored, along with strategies for successful implementation, including building a Center of Excellence, selecting the right tools, and fostering a culture of collaboration and continuous improvement. Finally, the book discusses emerging trends such as the use of generative AI (GenAI) in SRE and the future evolution of chaos engineering.

You'll learn how to integrate SRE practices into your existing enterprise tech operating model and see how these methodologies provide significant business value by reducing system downtime and enhancing operational stability. Additionally, this book will explore how GenAI can support SRE teams in planning, executing, and optimising reliability experiments and automating toil reduction and continuous improvement efforts.

By the end of this book, you'll be fully equipped to build an SRE-led chaos engineering practice, run reliability-focused "game days" to improve observability, troubleshoot failure scenarios, and strengthen the digital resilience of your systems and teams.

What You Will Learn
Understand the key terms and history of SRE and its guiding principles
Get insights into the SRE role and its evolution
Overcome the challenges in adopting SRE at any level of the organisation
Identify the maturity and readiness of site reliability building blocks to improve digital resilience

Note: This item can only be delivered to a German delivery address.

Product details

Publisher: Apress / Springer, Berlin
Publisher's item no.: 979-8-8688-1447-1
First Edition
Pages: 250
Publication date: 26 September 2025
Language: English
Dimensions: 235 mm x 155 mm
ISBN-13: 9798868814471
Item no.: 73466359

Manufacturer information
Libri GmbH
Europaallee 1
36244 Bad Hersfeld
gpsr@libri.de

About the authors

Francesco Sbaraglia is a distinguished Site Reliability Engineer (SRE) and a recognised expert in the field of Chaos Engineering and DevOps. With a career spanning over two decades, Francesco has garnered a wealth of hands-on experience as a practitioner and innovator, establishing a profound mastery of cutting-edge AIOps technologies and methodologies. In addition to his technical prowess, Francesco has distinguished himself as an accomplished author, contributing numerous insightful tech articles and authoritative books across a spectrum of subjects surrounding SRE, Chaos Engineering, operations, and DevOps. Francesco is also a public speaker, sharing his insights and best practices in SRE, observability, and chaos engineering at renowned industry conferences, such as SRECon21 and DevOpsCon. He is passionate about combining systems engineering principles with observability tools to ensure seamless operations and improve software engineering practices.

Florian Hoeppner is a seasoned professional technology strategist and advisor for tech operating models. He is an Enterprise Site Reliability Engineering subject-matter expert and DevOps expert with a deep understanding of tech operating model transformations. Florian is passionate about tech strategy, combined build-run teams, and optimising tech operations, and he has spoken and published extensively on these topics. He created a professional global community in his organisation with more than 500 members, constantly sharing and evaluating the latest developments in these critical topics. He is also the creator of the EngineeringOps radar, a yearly publication showing tech engineering and operational capabilities. He holds a degree in Media Information Systems and a Master of Science in Digital Media. Florian currently lives in New York and has a blog that offers practical insights into SRE, Chaos Engineering, and DevOps practices and solutions on an enterprise level. He has published the book "Competition as Motivation" with AV Akademikerverlag.

Table of contents

Chapter 1: Site Reliability Engineering 101
Chapter 2: Scale Site Reliability Engineering Capability in Enterprise
Chapter 3: Site Reliability Engineering: best in class and case studies
Chapter 4: Future of Site Reliability Engineering
Chapter 5: Deep dive into essential Site Reliability Engineering practices: Chaos Engineering and Observability
Chapter 6: Learn how to start and on what to focus in complex Enterprise
