Recommender systems: benefits and practical guidelines for software professionals
Recommender systems are information filtering systems that predict and recommend items a user may be interested in. They are widely used in e-commerce, social media, online content platforms, and other domains where there is a large amount of content to filter and where personalized recommendations can enhance the user experience. In this blog post we introduce recommender systems, list their benefits and challenges, and provide some practical guidelines for software professionals interested in developing their own.
The first recommender system was created in 1979 by Elaine Rich, who was looking for a way to recommend books that users might like. Her idea was to ask users specific questions and assign them to stereotypes based on their answers; each user would then receive book recommendations based on their stereotype. The first explicit mention of recommender systems came in 1990, in a technical report by Jussi Karlgren at Columbia University that described a “digital bookshelf”, and in the early 1990s several research groups started working extensively on personalized recommender systems. Since then, the field has grown significantly: recommender systems have become increasingly sophisticated, and today many different algorithms and techniques are used to build effective and scalable systems. Notable examples include Amazon’s product recommendations, Netflix’s show recommendations, and Spotify’s music recommendations.
The goal of a recommender system is to provide personalized and relevant recommendations to users based on their preferences, behavior, and context. Various types of recommender systems exist, but most of them fall into one of three groups:
- Collaborative filtering systems are based on the assumption that people who agreed in the past will also agree in the future; in practice, to make a recommendation for a specific user, these systems look at “similar” users (i.e. users who consumed and liked similar content) and recommend items that those users consumed.
- Content-based filtering systems perform recommendations based on the content of the items themselves, without looking at similarities between different users.
- Hybrid systems combine the strengths of several algorithms, and the majority of modern recommender systems fall into this group; for instance, it is possible to combine the scores of different recommendation components (e.g., one collaborative system and one content-based system) numerically, or to combine features derived from different knowledge sources and feed them to a single recommendation algorithm.
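As a rough illustration of the first and third groups, the sketch below scores unseen items with a user-based collaborative approach and then blends those scores with content-based ones. Everything in it is an assumption made for the example: the toy `ratings` data, the function names, the hard-coded content-based scores, and the 0.7 weight. It is not a production design.

```python
from math import sqrt

# Toy user -> {item: rating} data; names and values are illustrative only.
ratings = {
    "alice": {"book_a": 5, "book_b": 4, "book_c": 1},
    "bob": {"book_a": 4, "book_b": 5, "book_d": 5},
    "carol": {"book_c": 5, "book_e": 4},
}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating dicts."""
    shared = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def collaborative_scores(user, ratings):
    """Score items the user has not rated, weighting other users'
    ratings by how similar those users are to the target user."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return scores


def hybrid_scores(cf, content, weight=0.7):
    """Weighted blend of two score dicts; a missing score counts as 0.
    In practice both inputs should first be put on a comparable scale."""
    items = set(cf) | set(content)
    return {
        i: weight * cf.get(i, 0.0) + (1 - weight) * content.get(i, 0.0)
        for i in items
    }


cf = collaborative_scores("alice", ratings)
content = {"book_d": 0.2, "book_e": 0.9}  # hypothetical content-based scores
blended = hybrid_scores(cf, content)
ranking = sorted(blended, key=blended.get, reverse=True)
print(ranking)
```

Here Alice's tastes are closest to Bob's, so the item only Bob rated ends up ranked first even though the content-based component prefers the other one.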
Benefits
Recommendation systems have a number of benefits. Here we list some of the main contributions they can make for the users of a service (e.g. an e-commerce site or a streaming platform) and for the business behind it.
- Personalization. This is the most evident contribution of recommender systems: they can recommend content that is targeted to the specific needs and interests of each user, which can bring many benefits, as we will see in the next points.
- Increased user satisfaction. Recommender systems can help users find what they are looking for quickly and easily, as well as see new items that they might not have found otherwise, which can increase their satisfaction with the product or service.
- Increased consumption. By recommending relevant items that users might not have seen otherwise, recommender systems can increase user activity on the platform (e.g. more purchases on an e-commerce site, or more streams on a streaming platform).
- Improved user engagement. This is closely linked to the increased user satisfaction mentioned above: by providing customized content, recommendation systems can help keep users engaged with the service and therefore on the platform for longer periods.
- Reduced churn. Recommender systems can help reduce churn by providing users with recommendations that keep them coming back for more.
- Improved marketing. Recommender systems can help businesses improve their marketing efforts by providing them with insights into what users are interested in.
- Better inventory management. Recommender systems can help businesses manage their inventory more effectively by predicting which products are likely to be popular and ensuring that they are in stock.
Challenges
We have seen above that recommender systems bring many benefits, but there are also several challenges to keep in mind while building one.
- Cold start. When a new user or item is added to the system, there is little or no information about it, so some recommendation algorithms cannot be used. This is why music streaming services, for instance, often ask new users for their preferences as soon as they sign up: this (limited) information is used to provide the initial recommendations, which are later improved using the user’s activity on the platform.
- Sparsity. Usually, each user only consumes a small portion of the items in the catalog (e.g. movies in the library of a streaming service) and, similarly, most of the items in the catalog are consumed by a limited number of users. Therefore, if we consider the user-item matrix – which contains the information about the users’ activities – most of it will be empty (i.e., it is a sparse matrix), and recommendation algorithms have to be robust to this issue.
- Bias. This is a common problem for any machine learning algorithm, and recommendation systems are no exception. It is important to use only user features that reflect their preferences, and to evaluate the recommendation algorithm to check whether it exhibits any bias (for instance, leading all users to see only a limited range of items).
- Changing user preferences. User preferences may vary over time, which can make it difficult for recommendation systems to keep up with the latest trends. This is a common issue in machine learning – referred to as data drift – and a possible solution is to consider only recent interactions when performing a recommendation.
- Content diversity. Recommendation systems should strive to recommend a variety of content to users. This will help to ensure that users are exposed to new things and that they do not get stuck in a filter bubble where they are only shown content that they are already familiar with.
- User trust. Users need to trust that the recommender system is providing them with accurate and unbiased recommendations. Trust can be built by clearly explaining how the system works, giving users the ability to control their privacy settings, and making sure that the system does not recommend items that are inappropriate or offensive.
- Privacy concerns. Some users may be concerned about their privacy when using recommendation systems. It is important to address these concerns in order to gain user trust, possibly providing the ability to control some settings of the recommendation algorithm.
- Lack of data. Recommendation systems need data to train and learn from, and some of them need millions of interactions to provide accurate recommendations. Whether this is an issue depends on the specific recommendation algorithm, the size of the content library, and the number of user interactions available in the dataset. If there is not enough data and the system is performing poorly, a straightforward solution is to move towards simpler recommendation algorithms, which require less data for training.
- Scalability. Recommendation systems often need to handle large amounts of data, and keeping training and serving efficient as the number of users and items grows can be challenging.
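Two of the challenges above are easy to quantify on an interaction log. The sketch below measures how sparse the user-item matrix is and keeps only recent interactions, which is the simple remedy for changing preferences mentioned earlier. The log format, the function names, and the 90-day window are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical interaction log: (user_id, item_id, timestamp).
interactions = [
    ("u1", "i1", datetime(2024, 1, 5)),
    ("u1", "i2", datetime(2024, 3, 1)),
    ("u2", "i1", datetime(2023, 6, 10)),
    ("u3", "i3", datetime(2024, 2, 20)),
]


def sparsity(interactions):
    """Fraction of empty cells in the user-item matrix implied by the log.
    Only users and items with at least one interaction are counted."""
    users = {u for u, _, _ in interactions}
    items = {i for _, i, _ in interactions}
    if not users or not items:
        return 1.0
    filled = {(u, i) for u, i, _ in interactions}
    return 1.0 - len(filled) / (len(users) * len(items))


def recent_only(interactions, now, max_age_days=90):
    """Keep only interactions that are at most `max_age_days` old."""
    cutoff = now - timedelta(days=max_age_days)
    return [row for row in interactions if row[2] >= cutoff]


print(f"sparsity: {sparsity(interactions):.2f}")
recent = recent_only(interactions, now=datetime(2024, 3, 15))
print(f"{len(recent)} of {len(interactions)} interactions are recent")
```

On real catalogs the sparsity is usually far closer to 1 than in this toy log, which is why algorithms need to cope with mostly empty matrices.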
Practical guidelines
So far, we have seen how recommender systems can improve the user experience of an online service in several ways, and what their main benefits and challenges are. We now move on to some practical guidelines and best practices to keep in mind if you decide to build your own recommender system.
Define the problem and objective. Once you have decided to invest time (and money) in developing a recommender system, it is essential to precisely define the problem and the objective, by answering questions such as:
- What is the type of recommendation you want to provide?
- What kind of data do you have?
- How will you measure the success of the recommendation system?
At this stage, it is particularly important to think about how you will evaluate the success of the system. It is common to evaluate recommendation systems with numerical metrics that measure how users interact with the recommended content itself (e.g. the number of clicks on recommended items). While this is certainly relevant, it can be a fairly limited view: depending on the specific service and type of content, a user may click on a recommended item that does not fit well within their flow of consumption on the platform, in which case the recommendation is not effective in the long term. For this reason, it is better to also evaluate the recommender system on metrics that do not look directly at the recommended content, but at whether the behavior of users who received recommendations differs from that of other users: for instance, whether they make more purchases or have longer sessions on the platform. I found an interesting blog post on how Headspace evaluated the efficacy of their recommender system.
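A behavior-based comparison of this kind can start as simply as the sketch below, which computes the relative uplift in purchases between users who received recommendations and users who did not. The numbers and the function name are made up for the example. A real evaluation would also need random assignment of users to the two groups and a statistical significance test, which are left out here.

```python
from statistics import mean

# Hypothetical purchase counts per user over the same period.
purchases_with_recs = [3, 1, 4, 2, 5]     # users who saw recommendations
purchases_without_recs = [2, 1, 2, 3, 1]  # users who did not (control group)


def relative_uplift(treatment, control):
    """Relative difference between the two group means,
    e.g. 0.25 means the treatment mean is 25% higher."""
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control mean is zero, relative uplift is undefined")
    return (mean(treatment) - control_mean) / control_mean


uplift = relative_uplift(purchases_with_recs, purchases_without_recs)
print(f"relative uplift in purchases: {uplift:.1%}")
```

The same function works for any per-user quantity, such as session length, as long as both groups are measured over the same period.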
Choose the appropriate algorithm. The first thing to consider is whether you need a collaborative approach, a content-based approach, or (most likely) a hybrid one. Each algorithm has its own strengths and weaknesses, and the choice depends on the type of data you have and the objective of your system. For example, collaborative filtering works well when you have a large number of users and items with enough recorded interactions between them, while content-based filtering works well when you have rich item descriptions (or other representations of the items).
Crucially, the approach should be robust enough to provide recommendations even in difficult cases, such as new users. One way to achieve this is to fall back on a purely content-based approach or on a top-popular recommendation.
It is also a good idea to start simple. The most recent recommendation algorithms are very advanced (and generally accurate), but they are also complex and more expensive to train and run. A simple approach, such as a matrix factorization algorithm or even a most-popular recommendation, makes a good starting point and can then serve as a baseline for evaluating more advanced algorithms.
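A fallback of this kind could look like the sketch below. It assumes interactions are stored as (user, item) pairs and that `personalized` is whatever model you have, represented here as a plain function that returns a ranked list of items. The function names and the `min_history` threshold are hypothetical.

```python
from collections import Counter


def top_popular(interactions):
    """All items, ordered from most to least interacted with."""
    counts = Counter(item for _, item in interactions)
    return [item for item, _ in counts.most_common()]


def recommend(user, interactions, personalized, k=3, min_history=2):
    """Use the personalized model when the user has enough history,
    and top up (or fully replace) its output with popular items."""
    seen = {item for u, item in interactions if u == user}
    ranked = list(personalized(user)) if len(seen) >= min_history else []
    ranked += top_popular(interactions)
    result = []
    for item in ranked:
        # Skip items the user already consumed and avoid duplicates.
        if item not in seen and item not in result:
            result.append(item)
    return result[:k]


interactions = [
    ("u1", "a"), ("u1", "b"),
    ("u2", "a"), ("u2", "c"),
    ("u3", "a"), ("u3", "c"),
]


def toy_model(user):
    """Stand-in for a trained personalized recommender."""
    return ["c", "d"]


print(recommend("new_user", interactions, toy_model, k=2))
print(recommend("u1", interactions, toy_model, k=2))
```

Appending the popular items after the personalized ones also covers the case where the model returns fewer than `k` usable items.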
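To give an idea of how small such a baseline can be, here is a sketch of matrix factorization trained with stochastic gradient descent in plain Python. It is one common formulation (without bias terms), not the only one. The toy ratings and the hyperparameters (`n_factors`, `epochs`, `lr`, `reg`) are arbitrary assumptions, and for real datasets an established library would be a better choice.

```python
import random
from math import sqrt


def factorize(ratings, n_factors=2, epochs=200, lr=0.02, reg=0.02, seed=0):
    """Learn user and item factor vectors from (user, item, rating) triples."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    # Small positive random initial factors.
    P = {u: [rng.uniform(0.0, 0.5) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(0.0, 0.5) for _ in range(n_factors)] for i in items}
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, user, item):
    """Predicted rating: dot product of the user and item factors."""
    return sum(pu * qi for pu, qi in zip(P[user], Q[item]))


def rmse(P, Q, ratings):
    """Root mean squared error of the predictions on the given ratings."""
    errors = [(r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings]
    return sqrt(sum(errors) / len(errors))


# Toy explicit ratings; values are illustrative only.
ratings = [
    ("u1", "i1", 5.0), ("u1", "i2", 3.0), ("u1", "i3", 1.0),
    ("u2", "i1", 4.0), ("u2", "i3", 1.0),
    ("u3", "i2", 2.0), ("u3", "i3", 5.0), ("u3", "i4", 4.0),
]

P0, Q0 = factorize(ratings, epochs=0)  # untrained factors, for comparison
P, Q = factorize(ratings)
print(f"training RMSE before: {rmse(P0, Q0, ratings):.2f}")
print(f"training RMSE after:  {rmse(P, Q, ratings):.2f}")
print(f"predicted rating of i2 for u2: {predict(P, Q, 'u2', 'i2'):.2f}")
```

The RMSE shown here is measured on the training data only, so it says nothing about generalization; it is just a sanity check that training reduces the error.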
Provide transparency and interpretability. This is something to consider while choosing the appropriate algorithm. Recommender systems can sometimes be black boxes, making it difficult for users to understand how recommendations are made. Providing transparency and interpretability can help build trust with users and improve the overall user experience (for example, providing explanations for the recommended items can help users understand why a particular item was recommended). Some algorithms are more transparent than others, and it might be an aspect to keep in mind when choosing the algorithm to use.
Collect and preprocess data. The dataset is the backbone of any recommender system, and it is crucial to collect and preprocess data properly. The data should be clean, accurate, and representative of the problem you are trying to solve. Data preprocessing steps may include removing duplicates, handling missing values, and normalizing data.
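The three preprocessing steps mentioned above might be sketched as follows for a list of rating records. The record layout (dicts with `user`, `item`, and `rating` keys) and the choices made at each step (drop incomplete rows, keep the last duplicate, min-max scaling) are assumptions for the example; the right choices depend on your data.

```python
def preprocess(rows):
    """Clean a list of {'user', 'item', 'rating'} records."""
    # 1. Handle missing values: here, simply drop incomplete rows.
    complete = [
        r for r in rows
        if all(r.get(key) is not None for key in ("user", "item", "rating"))
    ]
    # 2. Remove duplicates: keep the last rating for each (user, item) pair.
    latest = {}
    for r in complete:
        latest[(r["user"], r["item"])] = float(r["rating"])
    if not latest:
        return []
    # 3. Normalize: min-max scale the ratings to the [0, 1] range.
    lo, hi = min(latest.values()), max(latest.values())
    span = (hi - lo) or 1.0  # avoid dividing by zero if all ratings are equal
    return [
        {"user": u, "item": i, "rating": (value - lo) / span}
        for (u, i), value in latest.items()
    ]


raw = [
    {"user": "u1", "item": "i1", "rating": 4},
    {"user": "u1", "item": "i1", "rating": 5},     # duplicate pair
    {"user": "u2", "item": "i2", "rating": None},  # missing rating
    {"user": "u2", "item": "i1", "rating": 1},
    {"user": "u3", "item": "i3", "rating": 3},
]
clean = preprocess(raw)
for row in clean:
    print(row)
```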
Train, evaluate, and tune the model. Once you have chosen the appropriate algorithm and collected the dataset, it is time to train the recommender system. As with any other machine learning task, it is important to split the dataset into training, development, and test sets, so that the generalization capabilities of the model can be evaluated on previously unseen data. At this stage, offline metrics are generally used to evaluate the model and tune it (i.e., tweak the hyperparameters to improve its performance).
Continuously keep track of the system performance. A recommender system, like any other machine learning system, is never complete. It should be continuously updated and improved based on users’ feedback and changing preferences. Monitoring the system’s performance and making adjustments when necessary is essential to ensure that the system continues to provide high-quality recommendations and is robust to common issues such as data drift.
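As a minimal sketch of this step, the code below splits a dataset at random into the three sets and computes two widely used offline metrics, precision@k and recall@k. The split fractions and function names are assumptions. For recommendation data, a split by time (training on older interactions, testing on newer ones) is often a better match for how the system is used, and is not shown here.

```python
import random


def split(rows, dev_frac=0.1, test_frac=0.1, seed=42):
    """Randomly split rows into train, dev, and test lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_frac)
    n_dev = int(len(rows) * dev_frac)
    test = rows[:n_test]
    dev = rows[n_test:n_test + n_dev]
    train = rows[n_test + n_dev:]
    return train, dev, test


def precision_recall_at_k(recommended, relevant, k):
    """Precision@k: share of the top-k recommendations that are relevant.
    Recall@k: share of the relevant items that appear in the top k."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


train, dev, test = split(range(100))
print(len(train), len(dev), len(test))

# Ranked output of a model for one user vs. the items they actually liked.
precision, recall = precision_recall_at_k(
    ["a", "b", "c", "d"], {"b", "d", "e"}, k=3
)
print(f"precision@3 = {precision:.2f}, recall@3 = {recall:.2f}")
```

Hyperparameters are tuned against the dev set, and the test set is kept aside for the final check.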
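Monitoring can start with a very small check, such as the sketch below, which compares the average of a daily metric over the most recent week with the week before and flags a large drop. The metric (click-through rate on recommendations), the window, and the 20% tolerance are all assumptions; a real setup would typically track several metrics and feed a proper alerting tool.

```python
from statistics import mean


def performance_dropped(daily_metric, window=7, tolerance=0.2):
    """Return True if the mean of the last `window` values is more than
    `tolerance` (as a fraction) below the mean of the preceding window."""
    if len(daily_metric) < 2 * window:
        return False  # not enough history to compare two windows
    recent = mean(daily_metric[-window:])
    baseline = mean(daily_metric[-2 * window:-window])
    return recent < baseline * (1 - tolerance)


# Hypothetical daily click-through rates on recommended items.
stable = [0.10] * 14
degraded = [0.10] * 7 + [0.07] * 7

print(performance_dropped(stable))
print(performance_dropped(degraded))
```

A flagged drop is a prompt to investigate (and possibly retrain on more recent data), not proof that the model itself is at fault.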
Conclusions
We have seen the benefits that recommender systems can bring, and some of the most common challenges that you might encounter while developing one. Building an effective recommender system requires a combination of domain expertise, data science skills, and knowledge of best practices. By following the guidelines above, you can develop a recommender system that provides personalized recommendations to users, improves their engagement and satisfaction, and meets your objectives.