Monday, 18 June 2018 23:47

The day IT operations got its mojo back


The advent of site reliability engineering and observability gives new skills and techniques to the operations side of DevOps, says Andi Mann, chief technology advocate for machine data aggregator and analysis vendor, Splunk.

Mann is an Australian living in Boulder, Colorado, with a global audience. In his role, he is charged with learning and researching what is important to Splunk's customers, understanding what leading-edge customers are doing, identifying what Splunk should adopt into its product roadmap and what it can do to make customers' lives more successful, and advocating to customers about the technologies and Splunk capabilities they can utilise to be better at what they do.

Mann is currently in Sydney to speak at Splunk Live, introducing customer stories about their innovative use of Splunk. "I always love doing Splunk Live. Any day with a customer is a good day," he says.

Mann took time from his busy schedule to speak to iTWire about what's currently caught his interest in all these discussions and research.

IT Operations

"I'm seeing a resurgence of IT Ops," he said. "So many businesses and vendors and analysts stop talking about DevOps when it comes to app release."

Two years ago, a group of Google engineers wrote a book, "Site Reliability Engineering", published by O'Reilly Media, and its concepts have gained traction. Observability is part of this, Mann explained.

Observability, he says, is a term that comes from industrial manufacturing, where you have systems you cannot see into. A water treatment plant, for example, has pipes all over the place and you can't see what's happening inside them. Is the water dirty or clean? Which direction does it flow? Is the pipe full or not? To answer these questions, engineers installed purity sensors which allow them to observe from the outside, using telemetry to see what's going on inside the pipe.

Google has brought this concept into IT to describe how new operations models can get visibility into applications – and with this, get better data and better metrics, and thus get ahead of problems.

DevOps brought a lot of goodness in collaboration across the entire software development lifecycle, Mann says. It's been good for developers, giving them access to IT operations capabilities like automated software release, troubleshooting and triage.

However, "observability and software reliability engineering gives IT Ops their mojo back," Mann says.


Another current topic is Splunk's announcement of its agreement to acquire VictorOps.

"VictorOps has great talent, which is a significant part of why we wanted to bring that team aboard," Andi Mann says. "They have forward-looking tech which brings together teams to not just review problems but collaborate on triaging, troubleshooting and launching automation to fix problems."

"The team is fantastic," he reiterates. The acquisition also establishes an official Splunk presence in Mann's hometown of Boulder, Colorado, advancing the "Silicon Mountain" moniker.

"Google has recently built a complex for 1500 people in Boulder. This is the calibre of town Boulder is for technical talent and I'm excited we (Splunk) have an opportunity to attract and retain talent."

VictorOps is a beautiful fit for Splunk, Mann says.

The OODA loop is the decision cycle of observe, orient, decide and act, Mann explains, developed by military strategist and United States Air Force Colonel John Boyd.

The first two phases are where Splunk "lives and breathes": do we have a problem at all, and what is the problem? The conventional Splunk monitoring and analytics tools, augmented by machine learning-driven event analytics, do this out of the box, allowing teams to see when things are going wrong and to identify the cause of the problem.

VictorOps comes in at phase three (how do we work together to solve the problem?), providing a modern, cloud-based system that incorporates ideas around triaging and troubleshooting together. Using VictorOps, multiple people can be geographically distributed but work in the one chatroom, pulling in Splunk dashboards and getting the right people together at the right time.

Then, when the resolution is agreed upon, automations can be kicked off from right within the VictorOps chatroom, whether Splunk actions or other third-party integrations.

Splunk previously acquired Phantom Cyber Corporation as an orchestration solution, executing workflows that implement known processes. Phantom gives teams the opportunity to execute recovery actions, and Splunk is working on how all the pieces are seamlessly integrated.

Thus, with the combination of Splunk and VictorOps, Mann says, IT teams can go all the way from "Aha, I have a problem. What is it? Let's work together to get the right people to make a decision after triaging and troubleshooting, then let's use Phantom, or maybe Puppet, Chef, or something else, to go and resolve that problem."
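
The flow Mann describes maps naturally onto the four OODA phases. The Python sketch below is illustrative only: it does not use any Splunk, VictorOps or Phantom API, and every name in it (Incident, observe, orient, decide, act, run_loop, the sample data) is a hypothetical stand-in chosen for this example.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, Dict, List, Optional


@dataclass
class Incident:
    """Hypothetical record that is filled in as the loop progresses."""
    service: str
    avg_latency_ms: float
    suspected_cause: str = ""
    responders: List[str] = field(default_factory=list)
    action: str = ""
    resolved: bool = False


def observe(samples: Dict[str, List[float]], threshold_ms: float) -> Optional[Incident]:
    """Observe: do we have a problem at all? Assumes each sample list is non-empty."""
    if not samples:
        return None
    worst = max(samples, key=lambda name: mean(samples[name]))
    avg = mean(samples[worst])
    return Incident(worst, avg) if avg > threshold_ms else None


def orient(incident: Incident, recent_changes: Dict[str, str]) -> Incident:
    """Orient: what is the problem? Here, a lookup of the latest known change."""
    incident.suspected_cause = recent_changes.get(incident.service, "unknown")
    return incident


def decide(incident: Incident, on_call: Dict[str, List[str]],
           runbook: Dict[str, str]) -> Incident:
    """Decide: get the right people together and agree on a resolution."""
    incident.responders = on_call.get(incident.service, ["duty-manager"])
    incident.action = runbook.get(incident.suspected_cause, "escalate")
    return incident


def act(incident: Incident,
        automations: Dict[str, Callable[[str], bool]]) -> Incident:
    """Act: kick off the agreed automation, if one is registered."""
    handler = automations.get(incident.action)
    incident.resolved = handler(incident.service) if handler else False
    return incident


def run_loop(samples, threshold_ms, recent_changes, on_call, runbook, automations):
    """One pass through observe, orient, decide and act."""
    incident = observe(samples, threshold_ms)
    if incident is None:
        return None  # nothing is wrong, so there is nothing to decide
    incident = orient(incident, recent_changes)
    incident = decide(incident, on_call, runbook)
    return act(incident, automations)


if __name__ == "__main__":
    result = run_loop(
        samples={"checkout": [900.0, 1100.0], "search": [120.0, 130.0]},
        threshold_ms=500.0,
        recent_changes={"checkout": "bad-deploy"},
        on_call={"checkout": ["alice", "bob"]},
        runbook={"bad-deploy": "rollback"},
        automations={"rollback": lambda service: True},
    )
    print(result)
```

In this sketch the first two functions stand in for the monitoring and analytics role Mann assigns to Splunk, decide for the collaborative triage he assigns to VictorOps, and act for an orchestration tool such as Phantom, Puppet or Chef.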

This is why Splunk speaks about a "platform for engagement", Mann says. "It's not just a monitor in a corner nobody looks at, and it's not just spitting out metrics. It enables IT pros to make decisions and act on them and return service to normal, all the while engaging with different teams – it's a platform for engagement."

Splunk has been working with VictorOps for a while, Mann says. He personally facilitated some early integrations which were literally customer-led. "Customers were asking us to work together so we released a two-way integration last year, with the ability to send alerts directly out of Splunk IT Service Intelligence (ITSI) to isolate a notable event using ML and integrate it in the GUI to send an alert to VictorOps.

"Customers said that's great, we know what the problem is when it happens but we need to work together in Splunk to fix the problem. So we continued to work on that integration to literally drop Splunk dashboards into a VictorOps chatroom and see the same information and speak the same language – so this integration has been around for a year or so.

"VictorOps doesn't just solve a problem and make Splunk a better platform for engagement. It's something our customers have proven for us works in a production environment, and it's a great acquisition because we've been doing that for a year or more."


David M Williams

David has been computing since 1984, when he instantly gravitated to the family Commodore 64. He completed a Bachelor of Computer Science degree from 1990 to 1992, commencing full-time employment as a systems analyst at the end of that year. David subsequently worked as a UNIX Systems Manager, Asia-Pacific technical specialist for an international software company, Business Analyst, IT Manager, and other roles. David has been the Chief Information Officer for national public companies since 2007, delivering IT knowledge and business acumen, seeking to transform the industries within which he works. David is also involved in the user group community, the Australian Computer Society technical advisory boards, and education.
