Recommender systems are old news for Amazon, one of the world’s largest online retailers, but that doesn’t mean it has figured out all the answers to personalisation yet.
Speaking at the Big Data Summit 2015 in Sydney today, Rajeev Rastogi, director of machine learning at Amazon, talked about a new predictive model for more personalised recommendations.
Rastogi said that one of the challenges in displaying personalised recommendations is that the person browsing through products on Amazon.com may not be the same person whose account is logged in.
“There are often multiple users sharing an account, so there’s a problem of multiple personas,” he said.
“A lot of the time recommendations are made without knowing which persona is browsing.
“So what we want to do is based on the session conclude which persona is browsing and make recommendations for that persona.”
Another challenge is making recommendations for new users, who don’t have a long purchase history to draw on.
“There are ‘cold start’ scenarios where they are new users and items. So here what we want to do is bridge user and items features.”
To tackle these challenges, Amazon came up with a PRLFM model. It combines personas with a novel regression-based latent factor model (RLFM) that was proposed by Deepak Agarwal and Bee-Chung Chen in their 2009 paper for the Proceedings of the 15th ACM SIGKDD.
To handle multiple user personas, the model creates a separate latent (or inferred) factor, u_ik, for each persona k of user i. Each item j also has its own latent factor, v_j. A separate latent variable, z, captures which of the user’s personas lies behind each preference rating, r_ij.
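The persona structure can be illustrated with a minimal Python sketch. It is not Amazon's implementation: it assumes, as in standard latent factor models, that a rating is predicted by the dot product of the persona factor and the item factor, and it picks the persona by a simple hard assignment (smallest squared error). The function names and toy numbers are hypothetical.

```python
import numpy as np

def predict_rating(U, V, i, k, j):
    """Predicted rating of item j by persona k of user i.

    Assumption: the prediction is the dot product of u_ik and v_j.
    """
    return float(U[i, k] @ V[j])

def most_likely_persona(U, V, i, j, r_ij):
    """Pick the persona of user i that best explains the observed rating r_ij.

    Assumption: a hard assignment by smallest squared error stands in for
    inferring the latent persona variable z.
    """
    errors = [(r_ij - predict_rating(U, V, i, k, j)) ** 2
              for k in range(U.shape[1])]
    return int(np.argmin(errors))

# Toy data: one user with two personas, two items, two latent dimensions.
U = np.array([[[1.0, 0.0],    # persona 0 of user 0
               [0.0, 1.0]]])  # persona 1 of user 0
V = np.array([[5.0, 1.0],     # item 0
              [1.0, 4.0]])    # item 1

persona_for_item0 = most_likely_persona(U, V, i=0, j=0, r_ij=4.5)
persona_for_item1 = most_likely_persona(U, V, i=0, j=1, r_ij=4.0)
```

In this toy account, a high rating for item 0 points to persona 0 and a high rating for item 1 points to persona 1, which is the kind of signal a session could provide.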
For ‘cold starts’, the model draws on the features of user i (x_i), such as demographics, location and browser history, and the features of item j (y_j), such as title, brand and price category. Each latent factor is then modelled as Gaussian-distributed around a linear regression on those features.
This can be calculated as:
User latent factor: $u_{ik} \sim \mathcal{N}(A_k x_i, \sigma_u^2 I)$
Item latent factor: $v_j \sim \mathcal{N}(B y_j, \sigma_v^2 I)$
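A minimal NumPy sketch of these two priors follows. The dimensions, feature values and weight matrices are made up for illustration. In a trained model the regression weights would be learned from data, and scoring a cold-start pair by the dot product of the two prior means is an assumption borrowed from standard latent factor models.

```python
import numpy as np

def prior_mean(weights, features):
    """Mean of the Gaussian prior: a linear regression on the features."""
    return weights @ features

def sample_latent_factor(weights, features, sigma, rng):
    """Draw a latent factor from N(weights @ features, sigma^2 * I)."""
    mean = prior_mean(weights, features)
    return rng.normal(loc=mean, scale=sigma)

rng = np.random.default_rng(0)

# Hypothetical sizes: 3 latent dimensions, 4 user features, 5 item features.
A_k = rng.normal(size=(3, 4))  # regression weights for persona k (random stand-in)
B = rng.normal(size=(3, 5))    # regression weights for items (random stand-in)

x_i = np.array([1.0, 0.0, 1.0, 0.5])       # features of a new user i
y_j = np.array([0.0, 1.0, 0.0, 0.0, 1.0])  # features of a new item j

u_ik = sample_latent_factor(A_k, x_i, sigma=0.1, rng=rng)
v_j = sample_latent_factor(B, y_j, sigma=0.1, rng=rng)

# With no rating history, the prior means give a feature-based score.
cold_start_score = float(prior_mean(A_k, x_i) @ prior_mean(B, y_j))
```

Because the means depend only on features, a user or item with no ratings still gets a usable latent factor, which is how the features are bridged.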
Rastogi tested the model on a real-world dataset, ‘MovieLens’, containing 100,000 ratings of 1,682 movies by 943 users.
When compared with other state-of-the-art models, such as k-Nearest Neighbors, on similar recommendation tasks, the PRLFM model outperformed most.
It achieved the second-best root mean square error when compared against nine other models, behind only the Sigmoid User Asymmetric Factor Model, which is a non-linear model.
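For reference, root mean square error measures how far predicted ratings fall from actual ratings, with lower values being better. The short Python sketch below shows the calculation. The ratings in it are invented for illustration and are not results from the talk.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length rating lists."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Invented example ratings on a 1-5 scale.
actual_ratings = [4.0, 3.0, 5.0, 2.0]
predicted_ratings = [3.5, 3.0, 4.0, 2.5]
error = rmse(actual_ratings, predicted_ratings)
```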
“This model is promising. We are going to experiment with this in terms of trying to recognise personas when users are in the middle of sessions and then do recommendations for those personas. We are hoping to launch this sometime in the future,” Rastogi said.