Google’s Site Reliability Engineering book covers the principles of Site Reliability Engineering (SRE), which Andi Mann, chief technology advocate for Splunk, described as “giving IT operations its mojo back.”
SRE is a discipline that applies aspects of software engineering to IT operations problems. Its main goal is to create ultra-scalable and highly reliable software systems.
It arose out of Google's own need to manage large-scale systems while continuously introducing new features and maintaining a high-quality end-user experience.
The SRE book can be read free online or purchased as a physical book. The free e-book deal is for Google’s brand-new companion book, The Site Reliability Workbook, which can be downloaded as a complete e-book at no charge until 23 August.
The SRE book is about the principles of SRE, while the workbook will help organisations implement SRE.
It is the hands-on companion, using concrete examples to show how to put SRE principles and practices to work. This book contains practical examples from Google’s experiences and case studies from Google’s Cloud Platform customers. Evernote, The Home Depot, The New York Times, and other companies outline hard-won experiences of what worked for them and what didn’t.
You can find it for free here.