Machine learning in a nutshell — part 2: Predicting future behaviour based on past data, with supervised learning

Customer Experience 3 February 2018

In the first article of this series dedicated to machine learning, we saw that the principal goal of machine learning is to automate common tasks with the help of computers. To do so, machine learning algorithms try to mimic human learning based on a mathematical model (if you’re just joining us, you can catch up here).

Machine learning solutions fall into three categories: supervised learning, unsupervised learning, and reinforcement learning. Let’s address the first one.

An introductory example, or why there’s no magic in machine learning

“Predicting future data” is the concept that underlies supervised learning, and it is often misunderstood because of the word “prediction”. If someone tells you: “in my hand I have a green fruit, rather round, with a diameter of 7.2 cm and a weight of 152 grams, that has seeds and a stem”, you are likely to assume it is an apple. You have just predicted future data based on past observations: in your childhood, your parents implicitly taught you how to identify an apple. Descriptors such as weight, shape, and colour are called “observable variables”, and the fact that this is an apple is the “target”. If you feed a machine learning algorithm these variables and explicitly give it the name of each fruit, you are providing supervised information.

Then take a fruit whose name you do not know, gather the same information about it, and give it to your model. The model will predict the most likely name based on what it has learned. But the model cannot know the unknown: if you only provide it with data about apples and pears, and then show it a banana, it will never guess “banana”! However, if you give it 10 million new apples and pears, it will label them far faster than any human could.
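To make the fruit example concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the measurements are invented, only two observable variables are used (diameter and weight), and the "model" is a very simple nearest-centroid rule rather than any particular production algorithm. It is only meant to show that a supervised model can answer nothing but the labels it was given.

```python
import math

# Hypothetical labeled history: (diameter in cm, weight in g) -> fruit name.
# These numbers are invented for illustration only.
TRAINING_DATA = [
    ((7.0, 150.0), "apple"),
    ((7.4, 160.0), "apple"),
    ((7.2, 148.0), "apple"),
    ((6.0, 180.0), "pear"),
    ((6.3, 190.0), "pear"),
    ((5.8, 175.0), "pear"),
]


def fit(examples):
    """Learn one average point (centroid) per label seen in the examples."""
    sums = {}
    for (diameter, weight), label in examples:
        total = sums.setdefault(label, [0.0, 0.0, 0])
        total[0] += diameter
        total[1] += weight
        total[2] += 1
    return {label: (d / n, w / n) for label, (d, w, n) in sums.items()}


def predict(model, diameter, weight):
    """Return the known label whose centroid is closest to the observation."""
    return min(model, key=lambda label: math.dist(model[label], (diameter, weight)))


model = fit(TRAINING_DATA)

# The fruit described in the text: 7.2 cm, 152 g.
described_fruit = predict(model, 7.2, 152.0)

# A made-up, banana-like measurement: the answer can only be a label
# the model has already seen, so it can never be "banana".
unknown_fruit = predict(model, 3.5, 120.0)
```

With this invented data, `described_fruit` comes out as an apple, while `unknown_fruit` is forced into one of the two known labels. A real project would use more variables, scale them so that weight does not dominate diameter, and check the model on fruit it has not seen during training.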

Important applications for digital marketing – if the data fits

Supervised learning can solve plenty of critical issues as long as you gather a large enough history of supervised information. Applications range from search engines (showing the best results for a specific query, bidding the right amount on a search query, etc.) and recommender systems (think Netflix recommendations) to pricing and advertising attribution modeling. For instance, you might want to determine sociodemographic information about users in order to adjust your message for a given target audience. Based on a large enough quantity of labeled information (real sociodemographic data for some users), supervised learning can learn rules that link web navigation behaviour to the target, and then infer the value for the rest of the population.

Supervised learning is also often used to predict a user’s action: will she or he buy a product in the near future? What is her or his probability of churn? Looking at a long purchase or churn history, an algorithm can learn the rule that explains the target, but only if meaningful information is available. Indeed, if 90% of users churn because of the hotline service, but interactions with the hotline are not recorded, the algorithm cannot be expected to perform well.
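The hotline example can be sketched in a few lines of Python. The customer history below is entirely hypothetical and generated by a fixed rule, the field names (`plan`, `called_hotline`, `churned`) are assumptions, and the "algorithm" is a deliberately naive one that remembers the most common outcome for each value of a single variable. To keep things short it is scored on the same users it learned from, which a real project should not do.

```python
from collections import Counter, defaultdict


def make_users():
    """Build a hypothetical, deterministic customer history (not real data)."""
    users = []
    for i in range(100):
        called_hotline = i % 2 == 0
        users.append({
            "plan": "premium" if i % 4 < 2 else "basic",
            "called_hotline": called_hotline,
            # Most churn follows a hotline call; a few users churn for other reasons.
            "churned": called_hotline or i % 40 in (1, 3),
        })
    return users


def learn_rule(users, feature):
    """For each value of the feature, remember the most common outcome."""
    outcomes = defaultdict(Counter)
    for user in users:
        outcomes[user[feature]][user["churned"]] += 1
    return {value: counts.most_common(1)[0][0] for value, counts in outcomes.items()}


def accuracy(users, feature):
    """Share of users whose churn the learned rule gets right."""
    rule = learn_rule(users, feature)
    hits = sum(rule[user[feature]] == user["churned"] for user in users)
    return hits / len(users)


users = make_users()
# Hotline calls were never recorded: only the plan is available.
accuracy_without_hotline = accuracy(users, "plan")
# Same users, but the hotline variable is now part of the data.
accuracy_with_hotline = accuracy(users, "called_hotline")
```

In this toy history, the rule learned from `plan` alone does no better than always answering "will churn", whereas the rule learned from `called_hotline` is right for the large majority of users. The algorithm is the same in both cases; only the available information changes.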

Thus, before dedicating months of work to a machine learning project, remember that there’s no magic to it: the algorithm learns rules to link variables to a target, just like a human would do. Start thinking about how you could try to explain the target of interest, and check if the variables you would use are available. If not, try to gather this information before spending too much time exploring your machine learning model.

Want to learn more?
Stay tuned, unsupervised learning and reinforcement learning will be addressed in upcoming articles!
