Machine Learning, Shallow Learning, Deep Learning: you have almost certainly come across these data-related buzzwords. But what do they mean? Look no further, we explain everything! Some parts of this article are more technical, but there is no need to understand every detail to become familiar with these concepts.
Artificial Intelligence (AI) is commonly defined as “The theory and development of computer systems able to perform tasks normally requiring human intelligence” (Oxford Dictionary). In fact, AI refers to a particular form of intelligence related to the etymology of the word “intelligence”, namely the ability to make a choice within a given context. For a more philosophical approach to Artificial Intelligence and its limitations, do not hesitate to read our post about Dataism.
There are several forms of Artificial Intelligence, more or less complex and suited to different problems. The simplest models of Artificial Intelligence are algorithmic: rules are written that define what the machine does in a given situation, e.g. “10 cm from a wall, turn 90° to the left”. For example, this type of Artificial Intelligence allows a car to follow the white lines of a road. However, such a car would not know how to react when there are no more white lines, or when there are potholes.
This is the main limitation of algorithmic Artificial Intelligence: the machine cannot respond when it encounters a situation for which no rule was written.
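To make the idea of hand-written rules concrete, here is a minimal sketch in Python. The function name `decide`, the reading keys and the action strings are made up for this illustration; only the "10 cm from a wall" rule comes from the example above.

```python
def decide(reading):
    """Return an action for a sensor reading, using only hand-written rules."""
    if "wall_distance_cm" in reading:
        # The one rule the programmer thought of: 10 cm from a wall, turn left.
        if reading["wall_distance_cm"] <= 10:
            return "turn 90 degrees left"
        return "go straight"
    # Nobody wrote a rule for this kind of reading, so the machine is stuck.
    return "no rule available"


print(decide({"wall_distance_cm": 8}))
print(decide({"wall_distance_cm": 50}))
print(decide({"pothole_ahead": True}))
```

The last call shows the limitation: a pothole is a situation the rules never anticipated, so the program has nothing sensible to answer.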
In order to develop machines that are able to respond to unprecedented situations, another type of Artificial Intelligence is used: Machine Learning.
Machine Learning consists of creating mathematical rules that a machine uses to learn how to make a decision in a given context. By extension, Machine Learning covers the techniques used to train Artificial Intelligences. Thanks to the learning process, the machine will be able to make a decision even in a new situation.
There are two types of Machine Learning: Shallow Learning and Deep Learning. The difference between the two is that in Shallow Learning, the decision is made according to precise criteria set by the person who created the Artificial Intelligence. In Deep Learning, information is fed to the machine without any indication of which elements should be taken into account. In this case, the Artificial Intelligence gives an answer based on its own experience. The strength of Deep Learning can also be its weakness: the machine can discover unsuspected patterns to solve a problem, but it is generally not possible to know how it made its choices.
As a comparison, let’s take the example of a real estate agent estimating the price of a house. He can do the maths based on the floor area, the number of rooms, the lot size, the proximity of an urban center, etc., and calculate an estimate. Alternatively, he can rely on his instinct and his experience. He knows that this house is worth more than that one, but will not necessarily be able to explain his valuation rationally. The first case would be a Shallow Learning-like intelligence, whereas the second would be a Deep Learning-like intelligence.
A real-world example of Deep Learning is the new generation of autonomous cars. How does the learning phase work for these cars? Sensors are placed on a large number of vehicles driven by humans, so that millions of kilometers of driving are recorded. In the end, the machine reproduces the driving of a human being, without any imposed rule, and the parameters it takes into account for its driving decisions are unknown.
We will now look at how these two types of Machine Learning work.
As already explained, in Shallow Learning the machine is told which parameters should be considered for the decision. Known features, that is, measurable variables, are used to train the AI.
To explain how Shallow Learning works, let’s take an example: we want to train an AI to predict the number of people wearing gloves according to the temperature. We assume that the colder it gets, the more people wear gloves. In this case, Shallow Learning consists of understanding the nature of the relation between the temperature and the number of people wearing gloves. Assuming that the relation is linear (meaning that the number of people wearing gloves changes in proportion to the temperature, with X as the proportionality coefficient), then:
number of people wearing gloves = X * temperature
In this case, the model is “number of people wearing gloves = X * temperature” and the learning of the machine will consist in finding the value X.
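As a rough sketch of what "finding the value X" can look like, the snippet below estimates X by least squares, one common way of fitting such a model (the article does not say which method is used). The observations are made up for the illustration, and `learn_x` is a hypothetical helper name.

```python
# Made-up observations: temperature in °C and number of people wearing gloves.
temperatures = [-10.0, -5.0, -2.0, -1.0]
glove_counts = [100.0, 50.0, 20.0, 10.0]


def learn_x(temps, counts):
    """Least-squares estimate of X in the model: count = X * temperature."""
    numerator = sum(t * c for t, c in zip(temps, counts))
    denominator = sum(t * t for t in temps)
    return numerator / denominator


x = learn_x(temperatures, glove_counts)
print(x)

# Once X is learned, the model can answer for a temperature it has never seen.
print(x * -7.0)
```

With these invented numbers the learned X is negative, which matches the intuition that the count goes up as the temperature goes down.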
Maybe this relation is not linear, but of another type. The person in charge of configuring the Shallow Learning has to determine the type of relation between the number of people wearing gloves and the temperature. The machine will try to come as close as possible to reality, but will probably not find the exact function. This is what is called uncertainty. Mathematical models are never 100% reliable, which is why the weather forecast is sometimes wrong. On the other hand, the advantage of Machine Learning over human intuition is that the level of uncertainty is known. When questioned, the AI will answer: “at 2°C, 120 people (plus or minus 10) are wearing gloves”.
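The sketch below illustrates how a model can report an estimate together with a margin. It assumes a straight line with an intercept and uses the spread of the residuals as a simple stand-in for the uncertainty; this is one possible choice, not a formal confidence interval, and the data and function names are invented for the example.

```python
from statistics import mean

# Made-up, slightly noisy observations (temperature in °C, people wearing gloves).
temperatures = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
glove_counts = [135.0, 118.0, 104.0, 83.0, 72.0, 50.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def residual_spread(xs, ys, slope, intercept):
    """Typical distance between the observations and the fitted line."""
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return (sum(r * r for r in residuals) / (len(xs) - 2)) ** 0.5


slope, intercept = fit_line(temperatures, glove_counts)
spread = residual_spread(temperatures, glove_counts, slope, intercept)
estimate = slope * 2.0 + intercept
print(f"at 2°C, about {estimate:.0f} people (plus or minus {spread:.0f}) wear gloves")
```

The fitted line never passes exactly through every point, and the size of those misses is what lets the model state how far off it is likely to be.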
Deep Learning and neural networks
Deep Learning is built on Artificial Neural Networks (ANN), a type of Artificial Intelligence that gained traction in the 1980s, with research such as that of Yann LeCun.
Artificial Neural Networks are inspired by the biological nervous system: many neurons receive information on one side (input) and translate it into a response on the other side (output), which makes it possible to take a decision. For example, “my hand touches a hob (input) and I feel that it is hot (output), so I remove my hand (decision)”.
In an Artificial Neural Network, there is a dataset representing a certain amount of basic information, such as speed, temperature, age, etc. These are the inputs. In the first layer of artificial neurons, each neuron receives all the inputs, each with a certain weight, and combines them with a mathematical function (addition, subtraction, multiplication, etc.). This function is called the “combination function”. Each neuron also contains the same “activation function”. This function determines the threshold at which the neuron is activated, in which case it returns the value (if not activated, the neuron does not return any value). If there are several layers of neurons, each layer receives its inputs from the previous layer, and the process is the same.
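Here is a minimal sketch of that mechanism in Python, assuming a weighted sum as the combination function and a simple threshold as the activation function. The weights, the input values and the choice of returning `0.0` to stand for "no value" are assumptions made for the illustration.

```python
def neuron(inputs, weights, threshold):
    """One artificial neuron: weighted sum, then a threshold activation."""
    # Combination function: here, a weighted sum of all inputs.
    combined = sum(x * w for x, w in zip(inputs, weights))
    # Activation function: pass the value on only if it reaches the threshold.
    # 0.0 stands in for "the neuron does not return any value".
    return combined if combined >= threshold else 0.0


def layer(inputs, weight_rows, threshold):
    """A layer is several neurons that all receive the same inputs."""
    return [neuron(inputs, weights, threshold) for weights in weight_rows]


inputs = [30.0, 20.0, 40.0]  # e.g. speed, temperature, age (made-up values)

# First layer: two neurons, each with its own weight per input.
hidden = layer(inputs, [[0.5, -1.0, 0.25], [-0.5, 0.5, 0.0]], threshold=0.0)
# Second layer: one neuron that receives the outputs of the first layer.
output = layer(hidden, [[1.0, 1.0]], threshold=0.0)

print(hidden)
print(output)
```

In this run the first hidden neuron is activated and the second is not, so only the first contributes to the final output.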
In Deep Learning, even if the parameters taken into account by the machine for its decision-making remain unknown, the person creating the AI tells it how it should learn, by configuring the weight of each link as well as the combination and activation functions. In effect, the human tells the machine which parameters it should take into consideration, but not how it should take them into consideration.
To train a neural network, random weights are generated for each link, and an example from a training set is submitted. The neural network calculates its output value and compares it to the expected result. This comparison is fed back to adjust the weights of the links and obtain a better result. The number of links grows quickly with the number of neurons: the more layers the network contains, the more steps the feedback algorithm needs to optimize the weight of each link.
The neural network detects by itself which characteristics it needs in order to answer the question.
This is why Deep Learning is very powerful for image recognition: to recognize pictures of animals, there is no need to explain to the machine the characteristics of each animal. Besides, would you be able to explain the characteristic differences between a cat and a dog? Probably not: you have simply seen a lot of cats and dogs and you are able to distinguish them, even if you do not know exactly how they differ. The same goes for a Deep Learning AI: it is trained with a certain number of images (for each of which it is told whether it shows a cat or a dog) and the AI learns by itself the characteristics that allow it to distinguish between these animals. Based on these self-taught characteristics, it will be able to determine which animal appears in the pictures shown to the machine afterwards.
The use of Artificial Neural Networks has increased significantly since 2010, thanks to the increasing computing power of modern computers. Today, desktop computers are able to run simple ANNs.
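The loop described above can be sketched with a single neuron. The example below uses a perceptron-style update rule, a deliberately simple stand-in for the feedback algorithms used in real deep networks, and trains it on the logical AND function. The function names, the learning rate and the training examples are assumptions made for the illustration.

```python
import random


def predict(weights, bias, inputs):
    """Weighted sum followed by a threshold at zero."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train(examples, learning_rate=0.1, max_epochs=1000, seed=0):
    rng = random.Random(seed)
    # Start from random weights, one per input link, plus a bias term.
    weights = [rng.uniform(-1, 1) for _ in range(2)]
    bias = rng.uniform(-1, 1)
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, expected in examples:
            # Compare the output to the expected result...
            error = expected - predict(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                # ...and feed the error back to nudge each weight.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if mistakes == 0:  # every example is now answered correctly
            break
    return weights, bias


# Training set: the output should be 1 only when both inputs are 1.
AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(AND_EXAMPLES)
for inputs, expected in AND_EXAMPLES:
    print(inputs, predict(weights, bias, inputs), expected)
```

Nobody tells the neuron which weights to use: it starts from random values and corrects them each time its answer is wrong. A deep network follows the same idea, with many more links to adjust.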
We went through four key concepts for Big Data and data analytics:
- Artificial Intelligence (AI): a software system that makes a decision according to a context. Algorithmic Artificial Intelligence is not able to make a decision in a new context.
- Machine Learning: techniques for training AIs to make a decision in a given context, even in an unprecedented situation.
- Shallow Learning: a family of Machine Learning techniques in which the variables that the machine must take into account for its decision are known.
- Deep Learning: a family of Machine Learning techniques using Artificial Neural Networks, in which the AI discovers by itself the important features for its decision-making. It is mostly impossible to know which features are taken into consideration.