Predicting a customer's next purchase using graph neural networks

Part I

5 min read · Jun 4, 2023

Background.

Making data-driven decisions is becoming the norm rather than the exception in organizations, given its benefits (Stobierski, 2019). With the recent boom in AI and machine learning, businesses are looking for ways to apply machine learning to their data and gain a competitive advantage. As Woodruff (1997) predicted, one such advantage comes from customers. Consequently, staying ahead in business boils down to the ability to understand your customers (Kahn et al., 2022).

Understanding a customer’s behavior and purchase decisions could enable businesses to predict what that customer is most likely to buy. This, in turn, has numerous advantages ranging from stock optimization to revenue prediction, both of which inform an organization’s ability to plan its finances strategically.

In this article, we explore the problem of predicting a customer’s next item purchase and implement a solution using a relatively new technology, graph neural networks (Scarselli et al., 2009). We go through the intuition behind the solution and why it is more robust than existing traditional machine learning methods. The end goal is that, given previous sales data, we should be able to predict which items, and in what quantities, a customer is most likely to buy.

Introduction.

Many underlying relationships among data can be represented as graphs (Scarselli et al., 2009). Graphs are particularly useful for modeling interactions between entities. In our context, we have two entities, customers and products, that interact when a customer purchases a product. For every product that a customer buys, we link the customer with that product. If a customer buys five products, five links are created, one from the customer to each product. To make this easier to visualize, let's put the customers on one side and all the products on the opposite side.

Figure: Bipartite graph of customers and products

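What we then end up with is called a bipartite graph (Wu et al., 2013). "Bi" means two: the nodes fall into two groups (customers and products), and every link joins a node in one group to a node in the other, never two nodes in the same group.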

The technical term for these entities (customers and products) is nodes. A node usually has features that describe it; a customer, for example, can have a name, age, gender, and so on. Typically, we choose customer features that give us insight into who the customer is and that could influence their purchasing decisions. Likewise, the product features we choose, such as the product description or brand, should be those that could influence a customer's purchase decision. This process is called feature selection and most often depends on your intuition and experience.

The link between one node and another is called an edge. Edges usually describe an action, e.g. a customer buys a product. Edges also have features that describe them, for example how much of a product the customer bought or the discount on the purchase. All these features, both node and edge, help the machine learning model learn how different nodes interact with each other.
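To make the vocabulary of nodes, edges, and features concrete, here is a minimal sketch in plain Python. The `BipartiteGraph` class, the customer and product names, and the feature values are all hypothetical and chosen only for illustration; a real project would more likely use a graph library.

```python
from dataclasses import dataclass, field


@dataclass
class BipartiteGraph:
    """Customers on one side, products on the other, purchases as edges."""
    customers: dict = field(default_factory=dict)  # customer id -> node features
    products: dict = field(default_factory=dict)   # product id -> node features
    edges: dict = field(default_factory=dict)      # (customer id, product id) -> edge features

    def add_purchase(self, customer_id, product_id, **edge_features):
        # An edge may only join a known customer to a known product,
        # which is what keeps the graph bipartite.
        if customer_id not in self.customers or product_id not in self.products:
            raise KeyError("both endpoints must exist before linking them")
        self.edges[(customer_id, product_id)] = edge_features

    def products_of(self, customer_id):
        """Products linked to a customer, in alphabetical order."""
        return sorted(p for (c, p) in self.edges if c == customer_id)

    def customers_of(self, product_id):
        """Customers linked to a product, in alphabetical order."""
        return sorted(c for (c, p) in self.edges if p == product_id)


# Hypothetical example data: two customers, two products, three purchases.
graph = BipartiteGraph()
graph.customers["alice"] = {"age": 34, "gender": "F"}
graph.customers["bob"] = {"age": 27, "gender": "M"}
graph.products["tea"] = {"brand": "Acme", "description": "green tea, 50 bags"}
graph.products["mug"] = {"brand": "Clay Co", "description": "ceramic mug"}

graph.add_purchase("alice", "tea", quantity=2, discount=0.10)
graph.add_purchase("alice", "mug", quantity=1, discount=0.0)
graph.add_purchase("bob", "tea", quantity=1, discount=0.0)

print(graph.products_of("alice"))
print(graph.customers_of("tea"))
```

Node features live on the customers and products, while the quantity and discount live on the edge, mirroring the description above.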

Our machine learning tool of choice for this problem is graph neural networks, a special type of neural network that operates on graph-structured data. This article assumes you already know the fundamentals of neural networks and deep learning. If you do not, please refer to this article.

Our machine learning task will be link-level prediction, where we attempt to predict whether a link (a purchase) will form between a customer and a product, given previous sales data.
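One way to picture link-level prediction is as a binary classification over (customer, product) pairs. The sketch below is an assumption on our part rather than something this article prescribes: it labels observed purchases as 1 and samples unobserved pairs as 0, a common technique known as negative sampling. The function name `build_link_examples` and the data are hypothetical.

```python
import random


def build_link_examples(purchases, customers, products,
                        negatives_per_positive=1, seed=0):
    """Return (customer, product, label) triples for link prediction.

    Observed purchases get label 1. Pairs with no recorded purchase are
    sampled and given label 0 (assumption: an absent link means "not bought").
    """
    rng = random.Random(seed)
    observed = set(purchases)
    examples = [(c, p, 1) for (c, p) in sorted(observed)]

    # Every customer-product pair that has no recorded purchase.
    candidates = [(c, p) for c in customers for p in products
                  if (c, p) not in observed]
    n_negatives = min(len(candidates), negatives_per_positive * len(observed))
    examples += [(c, p, 0) for (c, p) in rng.sample(candidates, n_negatives)]
    return examples


# Hypothetical sales history.
customers = ["alice", "bob", "carol"]
products = ["tea", "mug"]
purchases = [("alice", "tea"), ("alice", "mug"), ("bob", "tea")]

examples = build_link_examples(purchases, customers, products)
for example in examples:
    print(example)
```

A model trained on such triples learns to separate pairs that are linked from pairs that are not, which is the link-level task described above.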

A major problem for traditional machine learning approaches to recommendation systems (the category our solution falls under) is the cold start problem (Lendave, 2021). This occurs when there is not enough data about a customer or an item to predict whether the customer will buy a product. It is particularly common with methods such as collaborative filtering. As a result, we cannot correctly predict what a new customer will buy, and this is one of the problems we tackle using graph neural networks. Let’s get a good understanding of the intuition behind the solution and how we tackle the cold start problem.

Intuition.

Graph Neural Networks (GNNs) leverage message passing (Zhong et al., 2021) as a diffusion mechanism. In our scenario, we have a graph where customer nodes are connected to the products they have purchased. Through message passing, the customer nodes collect information from the product nodes and incorporate it into their own representations. In the same way, each product node collects information about every customer who bought it.

By aggregating information from the products they have purchased, the customer nodes create representations that reflect their purchase history, including the products they bought and the circumstances in which they bought them. Similarly, the product nodes aggregate information about the customers who bought them. These representations encode valuable patterns and relationships between customers and products.
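The aggregation described above can be sketched as a single round of message passing. This is a deliberately simplified illustration: it assumes customer and product vectors already share the same dimension, uses a plain mean as the aggregator, and omits the trainable weights and edge features a real GNN layer would have. All names and numbers are hypothetical.

```python
def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def combine(own, message):
    """Blend a node's own vector with the message from its neighbours."""
    return [(a + b) / 2 for a, b in zip(own, message)]


def message_passing_round(customer_h, product_h, purchases):
    """One synchronous round: all nodes update from the OLD neighbour vectors."""
    new_customer_h = {}
    for c, h in customer_h.items():
        neighbours = [product_h[p] for (cc, p) in purchases if cc == c]
        # A node with no neighbours simply keeps its own vector.
        new_customer_h[c] = combine(h, mean(neighbours)) if neighbours else list(h)

    new_product_h = {}
    for p, h in product_h.items():
        neighbours = [customer_h[c] for (c, pp) in purchases if pp == p]
        new_product_h[p] = combine(h, mean(neighbours)) if neighbours else list(h)

    return new_customer_h, new_product_h


# Hypothetical 2-dimensional starting representations.
customer_h = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
product_h = {"tea": [0.0, 2.0], "mug": [2.0, 0.0]}
purchases = [("alice", "tea"), ("alice", "mug"), ("bob", "tea")]

customer_h1, product_h1 = message_passing_round(customer_h, product_h, purchases)
print(customer_h1)  # {'alice': [1.0, 0.5], 'bob': [0.0, 1.5]}
print(product_h1)   # {'tea': [0.25, 1.25], 'mug': [1.5, 0.0]}
```

After one round, each customer's vector reflects the products they bought, and each product's vector reflects the customers who bought it. Running further rounds lets information travel further across the graph.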

To predict whether a customer will buy a specific product, we can employ a prediction mechanism such as a dot product between the customer node and the product node representations. The dot product yields a raw score; passing it through a sigmoid function turns it into a value between 0 and 1 that can be read as the likelihood of the customer purchasing the product. A higher score suggests a higher probability of purchase, while a lower score suggests a lower probability.

It is important to note that the actual implementation of the prediction process may involve additional steps, such as applying activation functions, utilizing trainable parameters, and employing appropriate loss functions for model training. The dot product serves as a component within the broader framework of a GNN-based prediction model.
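The scoring step can be sketched as follows. This is a minimal illustration, not the full model: the sigmoid is one possible activation function, binary cross-entropy is one possible loss, and the function names and vectors are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    """Squash a raw score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def purchase_probability(customer_vec, product_vec):
    """Raw dot-product score passed through a sigmoid."""
    return sigmoid(dot(customer_vec, product_vec))


def binary_cross_entropy(probability, label):
    """Loss for one (customer, product) pair; label is 1 if bought, else 0."""
    return -(label * math.log(probability)
             + (1 - label) * math.log(1 - probability))


# Hypothetical learned representations.
customer = [1.0, 0.5]
likely_product = [1.5, 0.0]     # points in a similar direction to the customer
unlikely_product = [-1.0, 0.0]  # points in the opposite direction

p_likely = purchase_probability(customer, likely_product)
p_unlikely = purchase_probability(customer, unlikely_product)
print(p_likely > p_unlikely)  # True
```

During training, the loss compares each predicted probability with whether the purchase actually happened, and the model's parameters are adjusted to reduce it.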

Because graphs capture complex relationships between nodes, and message passing spreads information along those relationships, they give us a good basis for solving the cold start problem. Since each product node collects information from the customers that purchased it, it learns what types of customers typically buy the product. Therefore, if we take a dot product between a new customer’s feature vector and a product's representation, any similarities between the new customer and the customers who bought the product increase the predicted probability that the new customer buys it.
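The cold start reasoning above can be illustrated with a small sketch. As a simplifying assumption, each product's representation here is just the mean of its buyers' feature vectors, standing in for what a trained GNN would learn; the helper names and numbers are hypothetical.

```python
def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def product_profiles(customer_features, purchases):
    """Summarise each product by the average features of its buyers."""
    buyers = {}
    for customer_id, product_id in purchases:
        buyers.setdefault(product_id, []).append(customer_features[customer_id])
    return {product_id: mean(vectors) for product_id, vectors in buyers.items()}


def rank_for_new_customer(new_features, profiles):
    """Order products by dot-product score, highest first."""
    scores = {p: dot(new_features, h) for p, h in profiles.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical numeric customer features (already encoded and scaled).
customer_features = {
    "alice": [1.0, 0.0],
    "bob": [0.9, 0.1],
    "carol": [0.0, 1.0],
}
purchases = [("alice", "tea"), ("bob", "tea"), ("carol", "mug")]
profiles = product_profiles(customer_features, purchases)

# A brand-new customer with no purchase history, only features.
new_customer = [1.0, 0.0]
ranking = rank_for_new_customer(new_customer, profiles)
print(ranking)  # ['tea', 'mug']
```

The new customer resembles Alice and Bob, who both bought tea, so tea scores higher even though the new customer has never bought anything.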

Conclusion.

In this article, we explored the problem of predicting a customer’s next purchase, the benefits behind it, and the intuition behind our solution. We also discussed the benefits of using GNNs to solve the cold start problem. Stay tuned for part two, where we will implement this solution practically.

About Phindor

The Phindor business assistant is a data engine specifically made to fit your business needs. We use artificial intelligence to process and interpret data to enable you to make meaningful and impactful decisions. We use advanced AI algorithms to carefully curate and present data relevant to your business to help you know who and where your customers are with high levels of accuracy.