10 tips for getting started with machine learning
12 September, 2017 20:00
Machine learning (ML) is fast becoming a litmus test for forward-thinking CIOs. Companies that fail to adopt machine learning for product development or business operations risk falling behind more nimble competitors in the coming decade. That's according to Dan Olley, who, as the CTO of Elsevier, the scientific and health information unit of RELX Group, has ratcheted up his organization's adoption of ML technologies in recent years.
"I fundamentally believe that we are at a tipping point with machine learning and it's going to change the way we interact with the digital world over the next decade," Olley told an audience of his peers last month at the CIO100 Symposium in Colorado Springs, Colo. "We're going to have decisions increasingly made by machines."
It's a reasonable assumption. Growth in computing power, the increasing sophistication of algorithms and training models, and a seemingly unlimited source of data have facilitated significant innovations in artificial intelligence (AI). AI, which covers any technology in which a machine can mimic the behavior of the human mind, includes subfields such as ML, in which statistics-based algorithms automate knowledge engineering. Google, Amazon, Baidu and others are pouring more money into AI and ML. Moreover, entrepreneurial activity unleashed by these developments drew three times as much investment in 2016 (between $26 billion and $39 billion) as it did the previous three years, according to McKinsey Global Institute.
The time to adopt AI and ML is now
AI adoption outside of the tech sector is mostly at an early, experimental stage, with few firms deploying it at scale, McKinsey reports. Companies that have not yet adopted AI technology at scale or as a core part of their business are unsure of the returns they can expect on such investments, according to McKinsey. But Olley, whose ML efforts at Elsevier have helped pharmaceutical clients discover drugs and deliver relevant medical information to clinicians, said use cases for ML abound in talent management, sales and marketing, customer support, and other areas.
CIOs had better get up to speed on these emerging technologies if they want to establish a competitive advantage or at least stay ahead of the curve. "It's something that you have to start embarking on now," Olley said.
How do organizations that have never worked with AI algorithms embark on data science or ML? Olley and Gartner offer the following tips.
1. Understand where data science fits
You have an idea for leveraging data science and ML at your organization, but how do you go about implementing it? First, you needn’t centralize your data science and ML operations. In fact, it may make sense to embed data science and machine learning into every department, including sales, marketing, HR and finance. Olley suggested CIOs try something that works for him at Elsevier, where he pairs data scientists with software engineers or oncology specialists, who build products in agile squads inspired by the Spotify model.
“We've built our data science teams into our product management teams and business units but we bring them together as a chapter and have one person lead that,” Olley said. “We do put the data scientists as close to the problem as we possibly can because we think that's the way to scale across the organization better.”
2. Get started
You needn't have a five-point plan for building a data science enterprise nor a framework to construct a polished ML product. Gartner says you should foster small experiments in different business areas with particular AI technologies for learning purposes, not ROI. "If you haven't yet I thoroughly recommend that you get started," Olley said. "Your competitors are."
3. Treat your data as if it's money
With data serving as the fuel for any AI/ML efforts, CIOs must treat their data like it's money by managing it, protecting it and obsessing over it. "Your CFO wouldn't just let the accounts be spread all over the company," Olley said. "Nor would he or she say, 'I think we've got about this much in revenue this year.'"
4. Stop looking for purple squirrels
Data scientists tend to be people who have a high aptitude for math and statistics and are skilled at finding insights in data; they are not necessarily software engineers who can write algorithms and craft products. Hiring with that distinction in mind is easier said than done, as companies often seek unicorn-like candidates who are master statisticians, ninja software engineers and masters of an industry domain, such as health care or financial services, Olley said. "I heard one person describe it as, 'I want a software engineer with a Ph.D. in mathematics who is also a trained clinician and if they have a specialty in oncology that would be really useful,'" Olley said, wryly adding that he knows "those three people."
5. Build a data science training curriculum
Not everyone who practices data science is going to be a data scientist or require a black belt in the craft. “You're not going to find enough of these people so you're going to have to work out how to train them,” Olley said, noting that he has a person responsible for “upskilling” his IT staff in data science. Elsevier has also used Coursera for help. At a minimum, Olley recommends that CIOs create a refresher course in probability and statistics, with a final exam candidates must pass to prove their mettle. Gartner advises you to identify AI knowledge and talent gaps and develop a training and hiring plan to build out your capabilities.
6. Endorse data science and ML platforms
Companies that are getting up to speed with AI and ML, or that are uncertain about how to tackle a data science problem, can dump their data into data science platforms such as Kaggle. There, teams of data scientists, statisticians, quants, software programmers and others who love tackling tough problems gather to compete on corporate business challenges.
7. Watch out for "derived data"
If you are going to share your algorithms with a partner, understand that they are seeing your data, Olley said. That doesn’t sit well with informatics companies like Elsevier, which is keen on protecting its data and views it as a competitive advantage. “Your data is the new currency,” Olley said. “You must understand strategically what you want to keep and what you're happy to share and treat it like money.”
8. Don't always try to solve the whole problem
A health-care organization could try to build an algorithm that replaces all primary care physicians, who are hard to see without an appointment scheduled far in advance. Or it could solve a piece of the problem by writing an algorithm that at least can discern whether a person just needs an aspirin versus more serious treatment. As Olley said: "Solve the little bits of the problem. Get more data. Build over time."
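To make the "solve a piece of the problem" idea concrete, here is a minimal Python sketch, not anything Olley or Elsevier described. It assumes a nearest-centroid rule as a stand-in for whatever model a team would actually choose, and the feature names, labels, numbers and the `margin` parameter are all invented for illustration. The point is the shape of the approach: the algorithm answers only the clear-cut cases and hands everything else to a person.

```python
from math import dist

# Hypothetical toy data: (temperature_c, pain_score_0_to_10) -> label.
# These numbers are invented for illustration and are not clinical guidance.
TRAINING = [
    ((36.8, 2.0), "self_care"),
    ((37.0, 3.0), "self_care"),
    ((37.2, 1.0), "self_care"),
    ((39.5, 8.0), "see_clinician"),
    ((40.0, 7.0), "see_clinician"),
    ((39.0, 9.0), "see_clinician"),
]


def centroids(rows):
    """Average the feature vectors for each label."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in acc)
        for label, acc in sums.items()
    }


def triage(features, model, margin=1.0):
    """Return the nearest label, or 'refer_to_human' when the call is close.

    `margin` (an assumed tuning knob) is how much closer the best centroid
    must be than the runner-up before the algorithm answers on its own.
    """
    ranked = sorted(model, key=lambda label: dist(features, model[label]))
    best, runner_up = ranked[0], ranked[1]
    gap = dist(features, model[runner_up]) - dist(features, model[best])
    return best if gap >= margin else "refer_to_human"


MODEL = centroids(TRAINING)
```

Calling `triage((36.9, 2.0), MODEL)` lands near the low-severity examples, while a borderline input such as `triage((38.25, 5.0), MODEL)` is deferred to a human. As more labeled data arrives, the deferred region can shrink, which is the "build over time" part of the advice.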
9. Don’t overthink your data models
It’s more important to get the right training sets than to perfect the data models. Don’t turn just anyone loose with data, which can lead to bad data models really quickly, Olley said. “The biggest challenge is showing people the art of the possible and really freeing them up to think about what this stuff can do ... and then scaling that out.”
10. Educate the CEO and board about AI
So your data science pilots show promise. As a CIO, you should promote AI and ML as a means of influencing the CEO's strategy, given their potential to disrupt markets and remake existing business models, according to Gartner. After all, successful machine learning operations may be the key to your organization’s future.