Collaborative filtering

What is collaborative filtering?

Collaborative filtering (CF) is a technique used by recommender systems. Collaborative filtering has two senses, a narrow one and a more general one.

In the newer, narrower sense, collaborative filtering is a method of making automatic predictions (filtering) about the interests of a user by collecting preferences or taste information from many users (collaborating). The underlying assumption of the collaborative filtering approach is that if a person A has the same opinion as a person B on an issue, A is more likely to have B's opinion on a different issue than that of a randomly chosen person. For example, a collaborative filtering recommendation system for preferences in television programming could make predictions about which television show a user should like given a partial list of that user's tastes.

In the more general sense, collaborative filtering is the process of filtering for information or patterns using techniques involving collaboration among multiple agents, viewpoints, data sources, etc. Applications of collaborative filtering typically involve very large data sets. Collaborative filtering methods have been applied to many different kinds of data including: sensing and monitoring data, such as in mineral exploration, environmental sensing over large areas or multiple sensors; financial data, such as financial service institutions that integrate many financial sources; or in electronic commerce and web applications where the focus is on user data, etc. 

What are types of collaborative filtering?

User-Based Collaborative Filtering (UB-CF)

User-based filtering measures the similarity between the target user and other users, and recommends items that users similar to the active user have liked.

Item-Based Collaborative Filtering (IB-CF)

Item-based filtering measures the similarity between the items that the target user has rated or interacted with and other items.
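
The two types differ mainly in which axis of the ratings matrix is compared. The sketch below is a minimal illustration, assuming a small made-up ratings matrix and cosine similarity as the measure (the article does not name one); the helper names are hypothetical.

```python
import math

# Hypothetical ratings: rows are users, columns are items, 0 means "not rated".
# Treating 0 as a real value is a simplification made for brevity.
ratings = [
    [5, 4, 0, 1],  # user 0
    [4, 5, 1, 0],  # user 1
    [1, 0, 5, 4],  # user 2
]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def user_similarity(ratings, u, v):
    # User-based: compare two rows of the matrix.
    return cosine(ratings[u], ratings[v])

def item_similarity(ratings, i, j):
    # Item-based: compare two columns of the matrix.
    col_i = [row[i] for row in ratings]
    col_j = [row[j] for row in ratings]
    return cosine(col_i, col_j)

print(user_similarity(ratings, 0, 1))  # users 0 and 1 rate alike
print(user_similarity(ratings, 0, 2))  # users 0 and 2 rate differently
print(item_similarity(ratings, 0, 1))  # items 0 and 1 attract the same users
```

In this toy data, users 0 and 1 come out far more similar than users 0 and 2, and items 0 and 1 more similar than items 0 and 2.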

How does Netflix use collaborative filtering?

Collaborative filtering uses the similarities between users and items to make recommendations. The algorithm continually finds relationships between users and, in turn, makes recommendations from them. It learns the user and item embeddings automatically, without hand-engineered features. The most common technique is matrix factorization, which finds the embeddings or features that make up the interests of a particular user.

Matrix Factorization

Matrix factorization is a simple embedding model. Given a user-movie matrix (the feedback matrix) A of size N × M, with N users and M movies, the model learns to decompose it into:

A user embedding matrix U of size N × d, where row i is the embedding for user i.

An item embedding matrix V of size M × d, where row j is the embedding for item j.

Here d is the embedding dimension. The embeddings are learned so that the product UVᵀ is a good approximation of the feedback matrix A.
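
To make the shapes concrete, here is a minimal sketch in plain Python. The numbers are made up, and the embedding size of 2 is an assumption chosen for readability: U has one row per user, V has one row per movie, and each predicted entry of A is the dot product of a user row with a movie row.

```python
# Hypothetical embeddings with 2 dimensions each.
U = [[1.0, 0.0],   # user 0
     [0.0, 1.0],   # user 1
     [0.5, 0.5]]   # user 2
V = [[5.0, 1.0],   # movie 0
     [4.0, 2.0],   # movie 1
     [1.0, 5.0],   # movie 2
     [2.0, 4.0]]   # movie 3

def predict(U, V):
    """Return U V^T as nested lists: one row per user, one column per movie."""
    return [[sum(u_k * v_k for u_k, v_k in zip(u, v)) for v in V] for u in U]

A_hat = predict(U, V)
for row in A_hat:
    print(row)
# [5.0, 4.0, 1.0, 2.0]
# [1.0, 2.0, 5.0, 4.0]
# [3.0, 3.0, 3.0, 3.0]
```

Three users and four movies give a 3 × 4 approximation, even though only (3 + 4) × 2 numbers are stored.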

Loss Function

To approximate the feedback matrix, a loss function is needed. One intuitive choice is the mean squared error (MSE), which averages the squared differences between the entries of the feedback matrix A and the corresponding entries of the approximation UVᵀ.
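
A minimal sketch of this loss follows. It assumes the error is averaged over the observed ratings only, which is a common choice but one the text does not specify; the function name and data are hypothetical.

```python
def mse_observed(observed, U, V):
    """Mean squared error over the observed (user, item, rating) triples only."""
    total = 0.0
    for user, item, rating in observed:
        predicted = sum(u_k * v_k for u_k, v_k in zip(U[user], V[item]))
        total += (rating - predicted) ** 2
    return total / len(observed)

# Hypothetical observed ratings and embeddings.
observed = [(0, 0, 5.0), (0, 1, 3.0), (1, 1, 2.0)]
U = [[1.0, 0.0], [0.0, 1.0]]
V = [[5.0, 1.0], [4.0, 2.0]]

loss = mse_observed(observed, U, V)
print(loss)  # 1/3: only the (0, 1) entry is off, by exactly 1
```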

Regularization Function

One of the most common problems when training the model is overfitting. Overfitting happens when the model fits noise or outliers in the training data, learning embedding features that do not contribute to its accuracy on unseen data. If the embedding for such an outlier feature grows to a large magnitude, the model is said to be over-fitted to that feature. A regularization term added to the loss penalises large embedding values to counter this.
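
One common way to do this is an L2 penalty on the embeddings. The sketch below assumes that form, with a hypothetical weight `lam`; the article does not say which regularizer is used, so treat it as an illustration only.

```python
def regularized_loss(observed, U, V, lam):
    """Mean squared error on observed ratings plus lam * (||U||^2 + ||V||^2)."""
    error = sum(
        (r - sum(a * b for a, b in zip(U[u], V[i]))) ** 2 for u, i, r in observed
    ) / len(observed)
    # Sum of squared embedding entries: large values are penalised.
    penalty = sum(x * x for row in U for x in row) + sum(x * x for row in V for x in row)
    return error + lam * penalty

# Hypothetical observed ratings and embeddings.
observed = [(0, 0, 5.0), (0, 1, 3.0), (1, 1, 2.0)]
U = [[1.0, 0.0], [0.0, 1.0]]
V = [[5.0, 1.0], [4.0, 2.0]]

plain = regularized_loss(observed, U, V, lam=0.0)       # error term only
penalised = regularized_loss(observed, U, V, lam=0.01)  # error + 0.01 * 48
print(plain, penalised)
```

Raising `lam` pushes the optimiser towards smaller embeddings at the cost of a slightly worse fit to the training ratings.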

Making recommendations

Generally, the steps (and functions) are listed below:

  • Create a sparse tensor: tf.SparseTensor(), to hold the feedback matrix A, and create the U and V matrices with random initialisation
  • Create the loss function and optimiser: tf.losses.mean_squared_error(), to estimate the total loss with the regularization penalty, with SGD as the optimiser
  • Create the model: tf.Session() (TensorFlow 1.x), and initialise the hyperparameters, learning rate and embeddings
  • Train the model, to learn the embeddings of the feedback matrix and return U and V as the embedding matrices
  • Show recommendations: pd.DataFrame(), to show the movies closest to the queried user
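
The steps above can be sketched without TensorFlow. The example below swaps in plain Python and hand-written SGD updates so that it does not depend on a particular TensorFlow version; the data, the hyperparameters (`lr`, `lam`, `epochs`) and all function names are assumptions made for illustration.

```python
import random

# Hypothetical (user, movie, rating) triples: the observed entries of A.
observed = [
    (0, 0, 5.0), (0, 1, 4.0), (0, 3, 1.0),
    (1, 0, 4.0), (1, 1, 5.0), (1, 2, 1.0),
    (2, 0, 1.0), (2, 2, 5.0), (2, 3, 4.0),
    (3, 1, 1.0), (3, 2, 4.0), (3, 3, 5.0),
]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mse(observed, U, V):
    return sum((r - dot(U[u], V[i])) ** 2 for u, i, r in observed) / len(observed)

def init_embeddings(n_rows, dim, rng):
    # Random initialisation of an embedding matrix.
    return [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_rows)]

def train(observed, U, V, lr=0.02, lam=0.01, epochs=1000):
    """SGD on squared error plus an L2 penalty; updates U and V in place."""
    dim = len(U[0])
    for _ in range(epochs):
        for u, i, r in observed:
            err = r - dot(U[u], V[i])
            for k in range(dim):
                u_k, v_k = U[u][k], V[i][k]
                U[u][k] += lr * (err * v_k - lam * u_k)
                V[i][k] += lr * (err * u_k - lam * v_k)
    return U, V

def recommend(U, V, user, seen, top_n=1):
    """Rank the movies the user has not rated by predicted score."""
    scores = {i: dot(U[user], V[i]) for i in range(len(V)) if i not in seen}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

rng = random.Random(0)
U = init_embeddings(4, 2, rng)
V = init_embeddings(4, 2, rng)
loss_before = mse(observed, U, V)
train(observed, U, V)
loss_after = mse(observed, U, V)
print(loss_before, loss_after)
print(recommend(U, V, user=0, seen={0, 1, 3}))
```

Each update moves one user row and one movie row towards agreement on a single observed rating, which is the same objective the TensorFlow optimiser would minimise.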

How do you solve collaborative filtering?

Collaborative filtering systems have many forms, but many common systems can be reduced to two steps:

  • Look for users who share the same rating patterns with the active user (the user whom the prediction is for).
  • Use the ratings from the like-minded users found in the first step to calculate a prediction for the active user.

This falls under the category of user-based collaborative filtering. A specific application of this is the user-based Nearest Neighbor algorithm.
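
A minimal sketch of the user-based nearest-neighbour idea follows. It assumes cosine similarity over co-rated items and a similarity-weighted average as the prediction rule; the user names, ratings and function names are made up.

```python
import math

# Hypothetical ratings: user -> {item: rating}.
ratings = {
    "alice": {"A": 5, "B": 4, "C": 1},
    "bob":   {"A": 5, "B": 5, "C": 1, "D": 5},
    "carol": {"A": 1, "B": 2, "C": 5, "D": 1},
}

def similarity(a, b):
    """Cosine similarity over the items both users rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict(ratings, active, item, k=2):
    # Step 1: find the k users most similar to the active user who rated the item.
    neighbours = sorted(
        ((similarity(ratings[active], r), r[item])
         for user, r in ratings.items() if user != active and item in r),
        reverse=True,
    )[:k]
    # Step 2: similarity-weighted average of the neighbours' ratings.
    total_weight = sum(sim for sim, _ in neighbours)
    if total_weight == 0:
        return None
    return sum(sim * rating for sim, rating in neighbours) / total_weight

print(predict(ratings, "alice", "D", k=1))  # uses only bob, the closest user
print(predict(ratings, "alice", "D", k=2))  # blends bob and carol
```

Because all ratings are positive, raw cosine similarity stays fairly high even for users with opposite tastes; subtracting each user's mean rating first is a common refinement.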

Alternatively, item-based collaborative filtering (users who bought x also bought y) proceeds in an item-centric manner:

  • Build an item-item matrix determining relationships between pairs of items
  • Infer the tastes of the current user by examining the matrix and matching that user's data
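
The two item-centric steps can be sketched as follows. This assumes the simplest possible relationship between items, a count of how many users bought both; the purchase data and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical purchase histories: user -> set of items bought.
baskets = {
    "u1": {"x", "y", "z"},
    "u2": {"x", "y"},
    "u3": {"x", "w"},
    "u4": {"y", "z"},
}

def build_item_item(baskets):
    """Count, for each pair of items, how many users bought both."""
    co = defaultdict(lambda: defaultdict(int))
    for items in baskets.values():
        for a, b in combinations(sorted(items), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co

def recommend(co, owned, top_n=2):
    """Score unseen items by their total co-occurrence with the user's items."""
    scores = defaultdict(int)
    for item in owned:
        for other, count in co[item].items():
            if other not in owned:
                scores[other] += count
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]

co = build_item_item(baskets)
print(recommend(co, {"x"}))  # ['y', 'w']
```

Raw counts favour popular items, so practical systems usually normalise them, for example with cosine similarity between item columns.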

Is collaborative filtering supervised learning?

No, collaborative filtering is a form of unsupervised learning in which we make predictions from ratings supplied by people. Each row represents one person's ratings of movies, and each column holds the ratings for one movie.

In collaborative filtering, we do not know the feature set beforehand. Instead, we try to learn it. As with handwritten digit recognition on MNIST, we do not know which features to extract at the beginning, but eventually the program learns those latent features (edge, corner, circle) itself.

What is the difference between content based and collaborative filtering?

Content-based filtering makes recommendations based on user preferences for product features. Collaborative filtering mimics user-to-user recommendations: it predicts a user's preferences as a linear, weighted combination of other users' preferences.
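
For contrast, here is a minimal content-based sketch. It assumes each item comes with a hand-made feature vector and that a user's profile is the rating-weighted average of the features of the items they rated; the features, ratings and function names are all hypothetical.

```python
# Hypothetical item features: [action, romance].
features = {
    "movie_a": [1.0, 0.0],
    "movie_c": [0.0, 1.0],
    "new_movie": [0.8, 0.2],  # nobody has rated this yet
}
user_ratings = {"movie_a": 5, "movie_c": 1}

def user_profile(user_ratings, features):
    """Rating-weighted average of the feature vectors of the rated items."""
    total = sum(user_ratings.values())
    dim = len(next(iter(features.values())))
    return [
        sum(r * features[item][k] for item, r in user_ratings.items()) / total
        for k in range(dim)
    ]

def score(profile, item_features):
    # Dot product of the user's profile with an item's features.
    return sum(p * f for p, f in zip(profile, item_features))

profile = user_profile(user_ratings, features)
print(score(profile, features["new_movie"]))  # scored without any ratings
print(score(profile, features["movie_c"]))
```

No other user's ratings are involved, which is why this approach can score an item that nobody has rated.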

Both methods have limitations. Content-based filtering can recommend a new item, but needs more data about the user's preferences in order to find the best match. Similarly, collaborative filtering needs a large dataset of active users who have rated products before in order to make accurate predictions. Combinations of these different recommendation approaches are called hybrid systems.
