These days, the most widely used branch of AI is Deep Learning. We can consider Deep Learning a subset of Machine Learning in which the algorithms "learn" by themselves. Artificial Neural Networks (ANNs) and Convolutional Neural Networks (CNNs) are the state of the art in terms of development and business adoption.
In Machine Learning (ML), the programming paradigm is a little different from the conventional one. A "usual" programmer analyzes the inputs and writes the program that produces the expected outputs. For instance, suppose we want to create a program that sums two numbers. The programmer evaluates the types of the inputs and outputs and then writes the code that transforms the input values into the desired outputs.
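As a small illustration of the difference between the two paradigms, here is a hedged sketch in Python: the hand-written `add` function follows the conventional approach, while a tiny least-squares model "learns" addition from example input/output pairs. The data values and the use of a plain linear model are assumptions made just for this example.

```python
import numpy as np

# Conventional programming: the programmer writes the rule explicitly.
def add(a, b):
    return a + b

# Machine Learning: we only provide example inputs and outputs,
# and let a tiny linear model discover the rule (here via least squares).
X = np.array([[1, 2], [3, 5], [10, 7], [0, 4]])  # pairs of numbers (inputs)
y = np.array([3, 8, 17, 4])                      # their sums (expected outputs)

weights, *_ = np.linalg.lstsq(X, y, rcond=None)  # "learns" weights close to [1, 1]

print(add(2, 3))                   # 5, from the hand-written rule
print(np.array([2, 3]) @ weights)  # ~5, from the learned rule
```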
When working with ML, we give the inputs and outputs to the system, and the algorithm "builds" the program itself. Thus, in the ML programming paradigm, it is the algorithm that creates the program for us. We represent the data as features, and it is the network that "builds" the connections between them and learns how the data are related. However, we still need to develop the ML model/algorithm; only then can it "learn" and create a program adjusted to the given data. An ANN is a neural network in which all neurons are connected, as we can see in Figure 1.
Each connection between neurons has a weight, and each neuron has a bias value. The neuron's value is the weighted sum of all connections to it plus its bias. When the data are used to train the ANN model, the weights are adjusted according to the optimizer, the activation function used on each layer of the model, and the input features.
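The computation of a single neuron can be written out in a few lines. The weights, inputs, bias, and the choice of ReLU below are made-up illustrative values, not anything prescribed by the model itself.

```python
import numpy as np

inputs = np.array([0.5, -1.2, 3.0])   # features feeding into the neuron
weights = np.array([0.8, 0.1, -0.4])  # one weight per incoming connection
bias = 0.2                            # bias term added to the weighted sum

z = np.dot(weights, inputs) + bias    # weighted sum of all connections plus the bias
activation = max(0.0, z)              # e.g. ReLU decides how strongly the neuron "fires"
print(z, activation)
```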
We can have any number of hidden layers; by adding them, we increase the complexity of the model and give it more capacity to understand the training data. But be careful: too many hidden layers can lead to overfitting. Overfitting happens when the model is so adjusted to the training data that it only performs well on that data. We need models that can handle new data and still achieve high accuracy, not overfitted ones.
In each layer, we apply an activation function to the neurons. The neuron computes the weighted sum of all its connections plus a bias value, and the activation function then determines whether (and how strongly) the neuron is activated. Each time the training data pass through the entire network, a process called an epoch, we obtain outputs. A cost function then tells us how far the model's outputs are from the correct values. Remember that we provide both the inputs and the expected outputs (supervised learning). Therefore, at the end of each epoch, the model's outputs are compared with the real values we provided. Based on the cost function, all the model's weights are updated by backpropagation. Later I will post in more detail about adjusting the model, as well as how to choose the best features to train it.
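To make the pieces concrete, here is a hedged sketch of a small fully connected network in tf.keras. The layer sizes, activations, optimizer, and the toy dataset are illustrative assumptions, not recommendations; the point is just to show where the hidden layers, activation functions, cost function, epochs, and weight updates appear in code.

```python
import numpy as np
import tensorflow as tf

# Toy supervised data: 200 samples with 4 features and a binary label (made up).
X = np.random.rand(200, 4).astype("float32")
y = (X.sum(axis=1) > 2.0).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),                      # 4 input features
    tf.keras.layers.Dense(16, activation="relu"),    # hidden layer 1
    tf.keras.layers.Dense(8, activation="relu"),     # hidden layer 2
    tf.keras.layers.Dense(1, activation="sigmoid"),  # output layer
])

# The loss (cost function) measures how far the outputs are from the expected values;
# the optimizer uses backpropagation to update the weights as training progresses.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=10, batch_size=32, verbose=0)  # each full pass over the data is one epoch
```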
Another well-known type of neural network is the CNN, a special case of the ANN. A CNN has four distinct steps/phases in its model (Convolution, Pooling, Flattening, and Fully Connected layers), as we can see in Figure 2.
Figure 2 - CNN - image from freecodecamp
A convolution layer applies a filter to an input, each application resulting in one activation. Sliding the same filter across the whole input produces the so-called feature map (a map of activations), indicating the detected features of, for instance, a photo (Figure 3). Thus, in the convolution layers, the CNN applies filters to extract the best features to use in the fully connected layers of its model.
Figure 3 - Applying a filter to an input, resulting in the feature map - image from freecodecamp
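The sketch below shows this filtering step by hand: one 3x3 filter is slid over a small 5x5 "image", and each position produces one activation of the feature map. Both the image and the filter values are invented for the example.

```python
import numpy as np

image = np.array([
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
])

# A 3x3 filter that responds to diagonal patterns (illustrative values).
kernel = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
])

# Slide the filter over the image; each position gives one activation,
# and all activations together form the feature map.
h, w = image.shape
k = kernel.shape[0]
feature_map = np.zeros((h - k + 1, w - k + 1))
for i in range(h - k + 1):
    for j in range(w - k + 1):
        feature_map[i, j] = np.sum(image[i:i + k, j:j + k] * kernel)

print(feature_map)
```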
The pooling layers are used to reduce the spatial dimensions of the resulting activations. We can apply different pooling types to our feature maps. In Figure 4, we can see max pooling being applied: we divide the feature map into 2x2 squares and keep the maximum value of each.
Figure 4 - Applying the Max Pooling technique to a feature map - image from computersciencewiki
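Max pooling on a small example can also be written in a couple of lines. The 4x4 feature map values below are made up; the reshape trick simply groups the map into non-overlapping 2x2 squares and keeps the maximum of each.

```python
import numpy as np

feature_map = np.array([
    [1, 3, 2, 1],
    [4, 6, 1, 0],
    [2, 1, 7, 5],
    [0, 3, 2, 8],
])

# Split the map into non-overlapping 2x2 squares and keep only the maximum of each.
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6 2]
               #  [3 8]]
```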
After a convolutional layer, we always add a pooling layer, and if the model has several convolutional layers (CL), each one is followed by a pooling layer (PL) (CL, PL, CL, PL, CL, PL, ...). Then we apply a flattening layer (FL), which turns the pooled feature maps into a one-dimensional vector so we can feed the fully connected layers. The last part of the CNN is an ANN that uses the data resulting from the CL, PL, and FL layers as its training input. Thanks to the convolutional layers, the CNN can "easily" categorize a picture or identify objects inside it.
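Here is a hedged sketch of that CL, PL, ..., FL, fully connected pattern in tf.keras. The input size, filter counts, kernel sizes, dense sizes, and the ten output categories are all illustrative assumptions, not values from the post.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 3)),                       # e.g. 64x64 RGB pictures
    tf.keras.layers.Conv2D(16, (3, 3), activation="relu"),   # CL
    tf.keras.layers.MaxPooling2D((2, 2)),                    # PL
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),   # CL
    tf.keras.layers.MaxPooling2D((2, 2)),                    # PL
    tf.keras.layers.Flatten(),                                # FL
    tf.keras.layers.Dense(64, activation="relu"),             # fully connected (ANN) part
    tf.keras.layers.Dense(10, activation="softmax"),          # e.g. 10 picture categories
])
model.summary()
```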
Deep Learning is widely used, and its use is still increasing drastically. We now have lots and lots of data to train our models, leading to neural networks able to classify, forecast, and predict quickly and with high accuracy. CNNs can give us amazing results in image and video classification and object recognition. On the other hand, ANNs can also give us amazing classification systems using numerical data.
Different CNN and ANN models are being applied, and we are seeing constant improvements in the Deep Learning field. Some systems can identify millions of different objects. What about the future? Do you think there will be systems that can identify any object known to humans? How fascinating is that? And how scary could it be if we think of the evil side?