Machine learning is infiltrating many industries. Marketers are using complex data algorithms to target customers based on their behaviours, while urban planning firms are creating better transport systems, and health organisations are detecting diseases earlier.
Carlos Guestrin, the Amazon professor of machine learning at the University of Washington, said last year that within five years every successful breakthrough app will use these methods at its core.
But applying machine learning in the highly complex field of cancer is a more laborious and challenging task, according to professor Mathukumalli Vidayasagar, a US-based control theorist who has been working with machine learning methods since the 1990s.
Vidayasagar, a Fellow of the Royal Society based at the University of Texas, was a keynote speaker at the University of Melbourne’s ‘Thinking Machines in the Physical World’ conference yesterday, where he discussed how machine learning can help analyse molecular data from cancer tumours.
He told CIO Australia that applying machine learning techniques to cancer problems is challenging as the data “isn’t very clean.”
“The biological processes are very noisy so whatever measurements you take, they are not very repeatable,” he said. “In other words, if we were to take the same measurement on the same tumour on the same instrument on two consecutive days, you will get somewhat different numbers.”
This would never happen in engineering, for instance, he said.
“If you measure a voltage today or tomorrow, you get exactly the same answer. If you measure a voltage with two different companies’ voltmeters, you’ll get the same answer. That kind of thing is just not true in biology so that’s what makes the data very noisy.”
Vidayasagar said he approaches this problem by inventing new machine learning algorithms as they are needed.
For instance, some of these have been applied to help predict the efficacy of drugs against lung cancer, and to determine how long it will be before ovarian cancer patients relapse so clinicians can better schedule follow-ups. Others are used to crunch data relating to various uterine and breast cancer subtypes.
Vidayasagar said the speed at which machine learning algorithms process data is of little consequence in cancer biology, unlike in areas such as detecting credit card fraud in e-commerce transactions, where number crunching must be done in real time.
“It will take two, three or four years to generate [cancer] data. So whether your algorithm takes one day or one week, it doesn’t make the slightest difference.”
Vidayasagar’s comments come as the use of machine learning and big data analysis techniques gains more traction in the cancer sector.
Last March, biotechnology firm Mitra released a study on predicting different patients’ clinical responses to anti-cancer drugs using a machine learning algorithm.
Several hospitals in the United States are also using big data analytics to unlock the secrets of paediatric cancer.
Vidayasagar said that, ideally, machine learning teams should work with medical doctors, but such collaborations are rare.
“The clinician community is a little bit unfamiliar with mathematical methods and if you use some machine learning algorithms to [produce results], they are not necessarily willing to try some of those ideas,” he said.
“The risk factor is [higher] for them because they are the ones in the firing line. It’s a long process from the time we publish what we think is a conclusive study to the time when it might make an actual impact on the clinical practice.
“It may never make it due to reasons beyond our control … we have to be philosophical about it,” he said.