Futurist Ray Kurzweil predicted in 1990 that a computer would beat a human world champion chess player by 1998. In 1997, that actually happened with IBM’s Deep Blue. Since then, artificial intelligence (AI) has continued to advance rapidly, making now a good time to brush up on what is considered the next wave of highly disruptive technology.
AI consists of many subdisciplines, such as natural language processing, computer vision, and knowledge representation and reasoning. Machine learning is how AI is put into practice: algorithms – fed with big data – enable computers or machines to pick up on patterns, predict future outcomes and train themselves on how best to respond in certain situations.
The technology is making its way into a broad range of industries from marketing with behavioural targeting, to healthcare with accurate and early detection of complex diseases, to infrastructure with smarter urban planning.
In part 1 of this series, CIO looks at how some of the big players are using AI, including one of the most talked-about facets of machine learning – deep learning, which uses artificial neural networks made up of many hidden layers between input and output.
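The "hidden layers" idea can be sketched in a few lines of code. The snippet below is a minimal illustration in plain NumPy, with made-up layer sizes and random, untrained weights – it only shows the shape of a deep network: data flows through several hidden layers, each re-representing the output of the one before, into an output layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Rectified linear unit, a common hidden-layer activation
    return np.maximum(0.0, x)

# Hypothetical architecture: 4 inputs, two hidden layers of 8 units, 3 outputs.
layer_sizes = [4, 8, 8, 3]
weights = [rng.normal(size=(m, n)) for m, n in zip(layer_sizes, layer_sizes[1:])]

def forward(x):
    # Each hidden layer transforms the output of the previous layer.
    for w in weights[:-1]:
        x = relu(x @ w)
    return x @ weights[-1]  # final (output) layer

output = forward(rng.normal(size=4))
```

In a real system the weights would be learned from data rather than drawn at random; this only illustrates the layered structure.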
IBM, Baidu, Google, Facebook, Apple, Microsoft and others have invested big in AI. Local players that have also long been on the scene include DSTO, NICTA and CSIRO.
“The key to intelligence is learning,” says Alex Zelinsky, chief defence scientist at the Defence Science and Technology Organisation (DSTO). “Once we master machine learning, then you can start to have artificial intelligence. We are intelligent because we can learn; you can learn lessons from doing things and remember those lessons.”
“Very often with a computer program we are just writing a sequence of instructions for the computer to follow in order to accomplish a task,” adds Adam Coates, director of Baidu Silicon Valley AI Lab.
“The idea behind machine learning is that there are some decisions … where it’s very hard to write down the instructions, so we would like the machine to learn to make those decisions based on looking at a bunch of examples. Deep learning is a technology that has really become popular in the last few years, which is a much more powerful version of machine learning,” he says.
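Coates' distinction – learning from examples rather than following hand-written instructions – can be illustrated with the simplest possible learner. The sketch below uses plain NumPy and invented example data: rather than coding an explicit rule, it fits a straight line to input–output pairs and uses the fit to predict a new case.

```python
import numpy as np

# Hypothetical examples: hours studied -> exam score.
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
scores = np.array([52.0, 58.0, 63.0, 71.0, 75.0])

# The "learning" step: a least-squares fit replaces a hand-written rule.
slope, intercept = np.polyfit(hours, scores, 1)

def predict(h):
    # Apply what was learned to an unseen input.
    return slope * h + intercept

estimate = predict(6.0)
```

Deep learning replaces the straight line with a stack of learned nonlinear layers, but the principle – generalising from examples – is the same.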
Facebook’s director of AI research, Yann LeCun, says that deep learning is becoming pervasive in ways that people don’t yet realise.
“Whenever you use voice recognition on your smartphone or whenever you upload a picture on Facebook and it recognises your friends, there’s deep learning in it.”
At Facebook, LeCun’s job is to find smarter ways to match content with users’ interests. Sounds simple, but it’s actually a difficult task because it involves training machines to read and understand all kinds of unstructured data such as text, images and videos to serve up relevant content at the right time and to a diverse bunch of users.
“Doing a really good job at this requires understanding content and understanding people. Ultimately that’s an AI problem because understanding people requires intelligent machines if you want to do a really good job.
“With current machine learning techniques it’s very difficult to have a machine read a text, for example, and then remember what the text is about, the events that happened in the text, and then answer questions.”
To achieve this, the machine needs a short-term memory – its own version of the hippocampus that we humans have in our brains, LeCun says.
In April this year, Facebook revealed at its F8 developer conference that its memory network can read a short, 15-sentence version of The Lord of the Rings and answer questions such as ‘where is Frodo?’ and ‘where is the ring now?’
“It’s basically a neural net with a piece of memory on the side,” says LeCun.
“If you have a machine that holds a dialogue with a person, that machine has to keep a trace of all the things that were talked about or what the topic of discussion is, figure out what the person knows and doesn’t know, and then do something for her or him. You need to keep track of all of those things and so you need a working short-term memory for that.”
Facebook’s memory network answers questions by figuring out where the topic in question appears in the text and retrieving it. A step beyond that is finding relations between objects and reasoning about geometry.
“If I step out of the room and then turn right and you ask the question ‘where should I go to meet Rebecca?’, it has to remember where you are and where I went, and do some geometric reasoning,” LeCun says.
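This is emphatically not Facebook's memory network, but the basic idea – store the sentences seen so far, then answer a ‘where is X?’ question by looking back through that memory – can be caricatured in a few lines of Python. The entity and location extraction below are crude, hypothetical heuristics; the real system learns which memory to attend to.

```python
def answer(question, story):
    # "Memory": the list of sentences seen so far. To answer "Where is X?",
    # scan backwards for the most recent sentence mentioning X.
    entity = question.rstrip("?").split()[-1]
    for sentence in reversed(story):
        words = sentence.rstrip(".").split()
        if entity in words:
            return words[-1]  # crude heuristic: location is the last word
    return "unknown"

story = ["Frodo went to the Shire.", "Frodo travelled to Mordor."]
location = answer("Where is Frodo?", story)
```

A trained memory network scores every stored sentence against the question using learned embeddings rather than exact word matching, which is what lets it cope with free-form text.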
Speech recognition is another big area where machine learning can be applied. At Baidu, the China-based search giant, the aim is to have mobile phone software accurately transcribe words in languages such as English or Mandarin and understand the request.
Today, the technology is not at a point where it’s more convenient than typing on a small keyboard, Coates says.
“Something we think is a big failing of current speech systems is that they don’t work well in noisy environments. If your phone is sitting on a table a little bit away from you in a room that has bad reverberation, or you try to talk to it in a crowded café – especially if it’s not a newer cell phone that has many microphones – it really doesn’t work quite as well.
“We are trying to make the system much, much more accurate so that when you speak to your phone you can do so casually like you and I are speaking together – the phone can understand what you’ve said and give you a really good transcription.
“And if we want a lot of these new and emerging applications like Internet of Things devices to work, we really feel that speech recognition systems have to handle noisy environments much better.”
Baidu trained its Deep Speech system, which first came out in December 2014, on more than 100,000 hours of voice recordings – first getting people to read to the machine and then adding synthesised noise over the top to make it sound like they were talking in a noisy room, cafeteria or car.
“We feed all of those synthetic examples to the deep neural network so that it can learn on its own that even when it hears a person speaking in different kinds of environments, they are always saying the same thing – the neural network will learn how to filter out the noise by itself.”
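The data-augmentation trick Coates describes – overlaying noise on a clean recording at a controlled loudness – might look something like the sketch below. It uses plain NumPy; the signals are synthetic stand-ins, and the target signal-to-noise ratio is a hypothetical parameter.

```python
import numpy as np

rng = np.random.default_rng(42)

def add_noise(clean, noise, snr_db):
    """Mix background noise into a clean recording at a target SNR in decibels."""
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale the noise so that clean_power / scaled_noise_power hits the target SNR.
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise

clean = np.sin(np.linspace(0, 100, 16000))  # stand-in for a voice recording
noise = rng.normal(size=16000)              # stand-in for cafe or car noise
noisy = add_noise(clean, noise, snr_db=10)
```

A training pipeline would generate many such mixtures per utterance, at varying SNRs and with real recorded background noise, so the network sees the same words in many acoustic conditions.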
The other part to this equation, Coates says, is getting the software to understand complex requests.
“For instance, if I ask you to book airline tickets and I give you a very complicated set of criteria and I tell you about this using natural language, it’s quite challenging to get a system to understand this in enough detail so it can go out and do what you want it to do.”
Abuse prevention is another area where machine learning comes in handy. Robin Anil – an ex-Googler who left the company this year to work on startup Tock with other former Google staff – spent much of his time at the search giant detecting offensive edits users made to Maps.
“You’ve probably seen Map Maker in the news recently – some people drew something bad on maps between Android and iOS. Those are the kinds of problems I dealt with.”
Machine learning and ‘trust modelling’ proved useful in verifying which user edits were true and which were false, Anil says.
“The only way we can figure that out is through the power of big data; trying to figure out if a lot of people agree that is the truth and the system figures out that is the truth. So it tries to figure out agreements between people.”
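Anil doesn't describe Google's actual algorithm, but the ‘agreement between people’ idea can be sketched as a trust-weighted vote. The trust scores and edit values below are invented for illustration.

```python
from collections import defaultdict

def consensus(edits):
    """edits: list of (user_trust, claimed_value) pairs.
    Return the value backed by the most total trust --
    a crude stand-in for 'a lot of people agree'."""
    votes = defaultdict(float)
    for trust, value in edits:
        votes[value] += trust
    return max(votes, key=votes.get)

# Hypothetical edits to the same map feature from users with different trust.
edits = [(0.9, "Main St closed"),
         (0.8, "Main St closed"),
         (0.3, "Main St open")]
accepted = consensus(edits)
```

A real trust model would also update each user's trust score based on how often their past edits matched the eventual consensus, so reliable contributors carry more weight over time.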