Australian oil and gas giant, Woodside, prides itself on its technology and engineering heritage. So when it came to tapping historical and streaming data stores to improve operational efficiency, as well as predict and circumvent potential issues in its production facilities, the company went for the most cutting-edge machine learning technology on offer: IBM’s Watson.
Speaking at the recent Chief Analytics Officer Forum in Sydney, Woodside’s principal data scientist, Elsa Jordan, shared with attendees the company’s journey to build a data science capability from scratch in one year that could be utilised by employees right across the organisation.
Woodside established its data science practice in January 2015 and, as part of its approach, is running the largest commercial instance of the Watson advisor engine.
“Our premise was think big, prototype small, scale fast,” Jordan said. “Key to that was using machine learning algorithms. What appealed was that we could learn from history, predict from streaming data and keep on learning as new information becomes available.”
The key question was how Woodside could capture the 20-odd years of valuable data on its energy projects, exploit the know-how from those historical projects, then make this information available to employees at any time, Jordan said.
“Data science is the answer to that – it provides the speed to access knowledge and the ability to derive insights from our data,” she said.
Any tech or business investment of such considerable size usually has a trigger, and for Woodside it was a rare event in 2013, with the potential to shut down an entire production facility, that set the data science train in motion.
The event revolved around how a production facility strips acid gas from the carbon stream prior to liquefaction. Much like gas in a shaken-up Coke bottle, ‘foaming’ can build up inside the unit and, in Woodside’s case, result in significant production outages. The problem is that engineers can’t tell whether that ‘bottle’ has been shaken up or not.
“Similarly, we can’t tell if foaming is happening in our cold vessels, and there isn’t a piece of equipment that can measure this,” Jordan explained. “We have to rely on proxies, like pressure measurements.”
Woodside pulled together its engineers to look at data after such an event to see whether there were any telltale signs that it was looming.
“The question was whether we had the pre-event data that could warn us in the future, and the answer was yes,” Jordan continued. “But it’s encoded into millions of rows, from readings across thousands of sensors, to logs and regular data.”
Woodside turned to machine learning to handle the vast volumes of data needing to be analysed. The resulting model provided a probabilistic reading on the likelihood of the foaming situation happening, and importantly, could flag it well in advance.
“This would give engineers time to go and investigate potential issues,” Jordan said. “More than just predictions, the model also actually captured vast knowledge across 20 years of running these systems, then allowed us to transfer this knowledge to other facilities so we could prevent these events in those locations, too.”
This requires not only the ability to churn through historical data, but also to keep analysing fresh streaming data so the likelihood of an event can be predicted on a 10-minute or hourly basis. Streaming data adds about 10GB to Woodside’s data lake per day, while calculations go into the millions.
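The article does not describe Woodside's model, so the following is only a minimal sketch of the general idea: scoring a rolling window of a proxy measurement, such as pressure, each time a new reading arrives and turning it into a probability-like value. The class name `FoamingRiskScorer`, the two features and all of the weights are hypothetical and hand-picked for illustration; a real model would be fitted to historical pre-event data.

```python
import math
from collections import deque
from statistics import pstdev


class FoamingRiskScorer:
    """Hypothetical rolling-window scorer. Weights are illustrative, not fitted."""

    def __init__(self, window_size=6, bias=-4.0, w_volatility=8.0, w_trend=5.0):
        # Keep only the most recent readings, e.g. six 10-minute samples.
        self.window = deque(maxlen=window_size)
        self.bias = bias
        self.w_volatility = w_volatility
        self.w_trend = w_trend

    def update(self, pressure):
        """Add one reading; return a risk score in (0, 1), or None until the window is full."""
        self.window.append(pressure)
        if len(self.window) < self.window.maxlen:
            return None
        values = list(self.window)
        volatility = pstdev(values)           # how unstable the proxy has been
        trend = abs(values[-1] - values[0])   # how far it has drifted over the window
        z = self.bias + self.w_volatility * volatility + self.w_trend * trend
        return 1.0 / (1.0 + math.exp(-z))     # logistic squashing to a probability-like score


def score_stream(readings):
    """Score a sequence of readings and return the final risk value."""
    scorer = FoamingRiskScorer()
    risk = None
    for reading in readings:
        risk = scorer.update(reading)
    return risk


steady_risk = score_stream([1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
unstable_risk = score_stream([1.0, 1.5, 0.8, 1.9, 0.6, 2.2])
```

With these made-up weights, a flat pressure trace scores close to zero while an oscillating, drifting trace scores close to one, which is the kind of early signal engineers could then investigate.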
Woodside is using the AWS Big Data platform to do this, which Jordan said can scale quickly and cost-effectively.
“We need to be at the cutting edge of technology for scale and to increase our performance,” she said.
“We’re an early adopter of Apache Spark technology, and have open source analytical toolkits running including R. We needed to be able to quickly test out and adopt data science technologies so we can be at the front end of the curve.”
Bringing in Watson
Woodside has also brought on Watson technology to help better tap the wealth of expertise, insight and knowledge across the organisation and in its data stores. The machine learning engine is being used to analyse about 200 million pages of technical documents and reports.
Jordan said engineers can ask Watson questions, and it comes back with an answer based on searching these documents. Most importantly, an experienced engineer takes that answer, looks at whether it’s correct or relevant, and feeds back intelligence into the system, improving the accuracy of results for future reference.
“Over time, and as you feed back in responses and feedback, it becomes more self-reliant and accurate,” Jordan said. “That’s what makes Watson fundamentally different to other types of search engines.”
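How Watson uses this feedback internally is not described in the article. As a rough illustration of the loop itself, the sketch below ranks passages by simple keyword overlap and then adjusts that ranking using an engineer's relevant or not-relevant judgements. The `FeedbackRanker` class, its scoring rule, the learning rate and the sample passages are all assumptions made for this example.

```python
class FeedbackRanker:
    """Hypothetical ranker: keyword overlap plus adjustments learned from feedback."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        self.boost = {}  # (question term, doc_id) -> learned adjustment

    @staticmethod
    def _terms(text):
        return set(text.lower().split())

    def score(self, question, doc_id, passage):
        terms = self._terms(question)
        overlap = len(terms & self._terms(passage))
        learned = sum(self.boost.get((term, doc_id), 0.0) for term in terms)
        return overlap + learned

    def rank(self, question, passages):
        """Return document ids ordered from most to least relevant."""
        return sorted(
            passages,
            key=lambda doc_id: self.score(question, doc_id, passages[doc_id]),
            reverse=True,
        )

    def feedback(self, question, doc_id, relevant):
        """Record an engineer's judgement so later rankings reflect it."""
        delta = self.learning_rate if relevant else -self.learning_rate
        for term in self._terms(question):
            key = (term, doc_id)
            self.boost[key] = self.boost.get(key, 0.0) + delta


# Made-up passages for illustration only.
passages = {
    "survey": "seabed survey schedule managing budget",
    "geotech": "soil loosening under platform foundations",
}
question = "managing seabed loosening"

ranker = FeedbackRanker()
before = ranker.rank(question, passages)   # keyword overlap alone favours "survey"
ranker.feedback(question, "survey", relevant=False)
ranker.feedback(question, "geotech", relevant=True)
after = ranker.rank(question, passages)    # feedback moves "geotech" to the top
```

The point of the sketch is the shape of the loop: the system proposes, an expert judges, and the judgement changes what is proposed next time.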
In one example, Watson helped a Woodside engineer who was designing a new offshore platform and was concerned about how to manage seabed loosening. Rather than having to scroll through thousands of potentially relevant documents, or having to find the one individual in the business with historical knowledge on these issues, the engineer used Watson and within seconds had relevant information to act on.
“Watson wasn’t trying to answer the question, and the main focus of the document wasn’t necessarily about the specific problem we were trying to solve, it was just one of the subsections,” Jordan stressed. “But that timesaving for our engineer in finding the answer he needed was huge.”
Jordan also noted Watson doesn’t give an opinion; it gives factual answers based on information found within the reports available to it.
“It remains with the engineers to decide if the answer is right, wrong or relevant. An answer doesn’t mean it’s exactly right,” she said. “That’s why it’s a research assistant, rather than the decision maker.”
Watson is currently in production at Woodside and being used by hundreds of people, and has ingested 20,000 documents. This doesn’t happen at the push of a button, however, and Jordan said it took between six and eight months to train up Watson.
“Part of that was it required a lot of collaboration from the business,” she said. “We needed a number of relevant questions engineers could ask. We asked the business and they gave us questions, but importantly, they came and gave up their time to train Watson.
“There were hours and hours of training sessions where Watson would go out with a number of rules-based, machine learning algorithms and come back with proposed answers, then engineers would evaluate them. It’s then that we applied machine learning algorithms on top of those supervised scenarios where we know what the right answers should be, or know it’s not a good answer.”
As a result, success is dependent on the enthusiasm and input of the rest of the organisation, Jordan said.
“The other lesson is that not everything needs to be that big,” she commented. “You don’t need an engagement adviser for every problem in the company. We are now deploying a number of other solutions with other Watson products which don’t require that amount of training and time.”
Ultimately, the investment in Watson is being quantified by the confidence it’s giving teams in data-driven decision making, Jordan said.
“This allows us to look at all of our history and expertise - we don’t reinvent the wheel and don’t want to make the same mistake again,” she added. “If we can avoid making the same mistake, 10 years later, then there’s the value.”