Why Google stopped 'feeding the machines with the blood, sweat, and tears of human beings'
The search giant maintains 15 data centers all over the world. Each of those data centers needs to be kept up and running 24/7, or the company's empire of apps and services crumbles.
That's why, early in its history, Google turned to data center automation: a combination of software and hardware that automatically handles difficult or repetitive tasks, so that relatively few Googlers can keep things humming along all over the world.
In the new book "Site Reliability Engineering: How Google Runs Production Systems," Google goes into some nerdy depths as to the hows and whys of its data centers.
But this illustrative quote from Joseph Bironas, who now manages data center growth at Google, shows why the company spends so much time thinking about automation:
"If we are engineering processes and solutions that are not automatable, we continue having to staff humans to maintain the system. If we have to staff humans to do the work, we are feeding the machines with the blood, sweat, and tears of human beings. Think The Matrix with less special effects and more pissed off System Administrators."
As the book explains, the end goal of automating its data center processes is to let any Googler anywhere in the world perform even the most complex operations in the data center, if necessary. Nobody should need to telepathically know what the original system designer was thinking.
In some places, Google is trying to take humans out of the equation entirely. It's building automated systems where it makes sense, and wherever possible. But sometimes, you just need a human touch.
"Of course, although Google is ideologically bent upon using machines to manage machines where possible, reality requires some modification of our approach," write Google chapter authors Niall Murphy with John Looney and Michael Kacirek. "Some essential systems started out as quick prototypes, not designed to last or to interface with automation."
"Site Reliability Engineering: How Google Runs Production Systems" is available now.