Machine learning will have a huge impact on business and society, but at present it is “still a cottage industry”, says Professor Bob Williamson, chief scientist of CSIRO’s Data61 group.
There’s been a resurgence of interest in machine learning in recent years. Though it’s not a new concept, factors like Big Data, more powerful computational processing and cheaper data storage mean more CIOs are investigating its applications.
But there’s still “an awful lot of black art” in the discipline, according to Williamson, who specialises in machine learning and analytics at Data61.
Speaking at a SAS customer event in Sydney, Williamson outlined the need for those working in the field to better share and standardise their work.
“There’s very few standards. There’s very little reuse,” he explained. “Plenty of my colleagues will do things from scratch. You can’t share anything, there’s always different systems.
“There’s these walled gardens: ‘I’ve gone and coded my models in a particular way, you’ve got your models coded in a different way, we can’t share’. This is a real challenge for the community. No one’s cracked this yet.”
Machine learning "as-a-service"
Though machine-learning-as-a-service (MLaaS) offerings herald some impressive results, Williamson warned businesses to be cautious.
“It’s very technique driven. Pretty well every provider of analytics solutions will say ‘look at the techniques I’ve got – I’ve got some core vector machines, I invented one of the core vector machine algorithms, it’s a great technique’,” he said.
“It’s still a technique. How do you know for your problems that it’s useful? You don’t.”
Prof Williamson also urged businesses to remember to conduct experiments, not simply collect data and look for patterns.
“In science ‘data mining’… that phrase used to be pejorative. If someone presented a scientific paper where they just trawled the data and they didn’t really have an idea of what they were doing, they just saw some pattern, this was frowned upon.
“You’re gathering the data. But it’s the experiment that comes first. For me this is the real promise of a data-driven world. It’s an experimenting world.”