A new study has shown how robots can quickly learn and adapt to their environments through natural language dialogue alone.
The study, Learning to Interpret Natural Language Commands through Human-Robot Dialog, was presented this week at the International Joint Conference on Artificial Intelligence in Argentina.
The researchers developed a dialogue agent for a mobile robot that can be deployed straight into a workplace and quickly learn to carry out delivery and navigation tasks for human workers, without first having to be trained on a large corpus of annotated data.
The agent automatically induces training examples from its conversations with humans, using a semantic parser that incrementally learns the meanings of previously unseen words. This also lets it accommodate new language variation, giving users flexibility in how they phrase their requests.
“This approach is more robust than keyword search and requires little initial data,” researchers from the University of Texas at Austin wrote in their paper.
“Further, it could be deployed in any context where robots are given high-level goals in natural language.
“To the best of our knowledge, our agent is the first to employ incremental learning of a semantic parser from conversations on a mobile robot.”
The agent is also capable of multi-entity reasoning when doing navigation tasks such as finding a person’s office by noting it is next to another person’s office.
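A minimal sketch of this kind of multi-entity reasoning: an office can be resolved either directly by its occupant, or relationally via an "adjacent to" link to another person's office. The names, room numbers, and layout below are hypothetical, not from the paper.

```python
# Hypothetical office map: person -> room, and room -> neighbouring room.
offices = {"alice": "3508", "bob": "3512"}
adjacent = {"3508": "3510", "3512": "3514"}

def find_office(person=None, next_to_person=None):
    """Resolve a room number from a direct query ("alice's office")
    or a relational one ("the office next to bob's office")."""
    if person is not None:
        return offices[person]
    return adjacent[offices[next_to_person]]

print(find_office(person="alice"))          # 3508
print(find_office(next_to_person="bob"))    # 3514
```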
The University of Washington Semantic Parsing Framework was used for mapping a natural language request to meaning representations. Lambda calculus (λ-calculus) formulas were used to represent the meanings of words (lexical items) and combinatory categorial grammar (CCG) was used to tag each word with a syntactic category, including a template-based GENLEX (lexical generation procedure).
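To illustrate the idea, here is a far simpler sketch (not the UW SPF implementation) of lambda-calculus meanings composing under CCG-style function application. The logical forms are invented for illustration.

```python
# Lexical meanings: 'go' is a function from a destination to an action
# (λx. navigate(x)), modelled here as a Python function; 'room 3418' is
# an entity constant.
go = lambda dest: ("navigate", dest)    # λx. navigate(x)
room_3418 = ("room", "3418")            # constant for the room entity

# CCG forward application (X/Y  Y  =>  X) combines the verb with its
# argument to produce the sentence's logical form.
logical_form = go(room_3418)
print(logical_form)   # ('navigate', ('room', '3418'))
```

GENLEX then works in the other direction: given a sentence paired with a known logical form, it proposes candidate category-and-meaning entries for unfamiliar words.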
More than 300 users interacted with the agent via an Amazon Mechanical Turk web interface, and 20 users via a wheeled Segbot robot in an office. The agent performed poorly at first, but quickly learned and improved at carrying out tasks.
The agent first clarifies what the user means by a request, prompting them to phrase it another way so it can learn different ways of saying the same thing. Early interactions can be monotonous, but the agent soon picks up the vocabulary; the key is its ability to learn on the go without extensive training beforehand.
“One user’s conversation began ‘please report to room 3418’, which the agent could not parse because of the new word ‘report’. The agent understood the re-worded request ‘go to room 3418’, and the former sentence was paired with the logical form of the latter for training.
“When the GENLEX procedure explored possible semantic meanings for ‘report’, it found a valid parse with the meaning of ‘go’ … and added it to the parser’s lexicon.”
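The induction loop the quote describes can be sketched as follows. This is a hypothetical, heavily simplified stand-in: a failed request is paired with its successful re-wording, and unknown words are assigned candidate meanings drawn from that re-wording. The real GENLEX procedure keeps only hypotheses that yield a valid parse; that check is omitted here.

```python
# Known word meanings, written as lambda-calculus strings for illustration.
lexicon = {"go": "λx. navigate(x)", "room": "λn. room(n)"}

def induce_from_rewording(failed, reworded):
    """Propose lexical entries for words seen only in the failed request."""
    unknown = [w for w in failed.split()
               if w not in lexicon and w not in reworded.split()]
    known_meanings = [lexicon[w] for w in reworded.split() if w in lexicon]
    for w in unknown:
        for meaning in known_meanings:
            # Keep the first hypothesis; the real system keeps those
            # that produce a valid parse of the failed sentence.
            lexicon.setdefault(w, meaning)

induce_from_rewording("please report to room 3418", "go to room 3418")
print(lexicon["report"])   # λx. navigate(x)
```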
The agent is also able to associate words such as ‘bring’ and ‘deliver’, and ‘Java’ and ‘coffee’, when carrying out a delivery task. It also asks whether the user is satisfied with how it handled the request, so it learns to handle requests correctly.
The agent can also learn to make the connection between someone’s nickname and their actual name, such as ‘Frannie’ for ‘Frances’ and ‘Bob’ for ‘Robert’, so that users do not always have to refer to a person by their full name when talking to the agent.
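Once such a link is learned, both forms resolve to the same person. A minimal sketch, with a hand-written alias table standing in for links the agent learns from dialogue:

```python
# Hypothetical alias table: nickname -> canonical name (lowercased).
aliases = {"frannie": "frances", "bob": "robert"}

def resolve_person(name):
    """Map a nickname or full name to a canonical person identifier."""
    n = name.lower()
    return aliases.get(n, n)

print(resolve_person("Frannie"))   # frances
print(resolve_person("Frances"))   # frances
```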
Users were asked to fill in a survey rating the interaction on a 0–4 scale: strongly disagree (0), somewhat disagree (1), neutral (2), somewhat agree (3) and strongly agree (4), for statements such as ‘robot understood me’, ‘robot frustrated me’, and ‘I would use the robot to get items for myself and others’.
Looking at the results from the in-office Segbot experiment, on its first day the agent scored an average of 1.6 for understanding users, 2.5 for frustrating users, and 1.6 for usefulness in delivery tasks. After four days of interacting with humans in the office, the figures improved: 2.9 for understanding requests, 1.5 for frustrating users, and 2.5 for usefulness in delivery.
The percentage of navigation tasks completed held steady at around 90 per cent over the four days, but the percentage of delivery tasks completed improved from 20 per cent to 60 per cent.
According to the user surveys, the agent’s ratings dropped slightly over the four days for navigation usefulness and for users easily understanding tasks.
The agent also corrects users when they make grammatical errors, as one user pointed out in their survey.
Future work on the agent includes applying it to speech recognition, with the researchers exploring whether it can automatically learn to correct consistent speech recognition errors. At the moment, users type requests through the mobile robot and a web-based interface.
“As the robot platform gains access to more tasks, such as manipulation of items, doors, and light-switches via an arm attachment, we will scale the agent to learn the language users employ in that larger goal space.
“We also plan to add agent perception, so that some predicates can be associated with perceptual classifiers, and new predicates can be discovered for new words.”