Image Classification Problem Formulation

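[Diagram: an input image x passes through a neural network f(x) to give a prediction ŷ, which is compared with the label y using cross entropy to produce a scalar error.]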

Input Image (x): This is the raw data that you want the neural network to classify. In this case, it's a picture of a bird.
Neural Network (f(x)): The image is fed into a neural network, a model made up of layers of neurons that process the input data. The network has been trained on a dataset of labeled images (for example, pictures labeled 'cat', 'bird', or 'dog'), from which it learns to recognize patterns and features.
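To make f(x) concrete, here is a minimal sketch of a forward pass, assuming a tiny two-layer network written in NumPy. The layer sizes, the random (untrained) weights, the 8x8 stand-in image, and the softmax output layer are all illustrative assumptions, not the architecture from the diagram.

```python
import numpy as np

CLASSES = ["cat", "bird", "dog"]

def softmax(z):
    # Subtract the max for numerical stability; the result is unchanged.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def f(x, W1, b1, W2, b2):
    """Hypothetical two-layer network: flatten -> ReLU layer -> softmax."""
    h = np.maximum(0.0, W1 @ x.ravel() + b1)  # hidden layer with ReLU
    logits = W2 @ h + b2                      # one raw score per class
    return softmax(logits)                    # scores -> probabilities

rng = np.random.default_rng(0)
x = rng.random((8, 8, 3))                     # stand-in for an RGB image

# Random weights: this network is untrained, so its output is arbitrary.
W1 = rng.normal(0.0, 0.1, (16, x.size))
b1 = np.zeros(16)
W2 = rng.normal(0.0, 0.1, (len(CLASSES), 16))
b2 = np.zeros(len(CLASSES))

y_hat = f(x, W1, b1, W2, b2)
print(dict(zip(CLASSES, y_hat.round(3))))
```

Training would replace the random weights with values learned from the labeled examples. The structure of the computation stays the same.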
Prediction (ŷ): After processing the image, the neural network outputs a vector of probabilities, one for each class it has been trained to recognize. Each entry represents the model's confidence that the input image belongs to that class, and the entries sum to 1. In the diagram, the network predicts the following probabilities:
- Cat: 0.15
- Bird: 0.75
- Dog: 0.10
The class with the highest probability is considered the model's prediction. In this case, the neural network predicts that the image is of a bird, with a probability of 0.75.
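Picking the predicted class from the probability vector can be sketched as follows, using the values from the diagram. The class ordering (cat, bird, dog) and the variable names are assumptions made for illustration.

```python
import numpy as np

classes = ["cat", "bird", "dog"]
y_hat = np.array([0.15, 0.75, 0.10])  # probabilities from the diagram

# A valid probability vector sums to 1.
assert np.isclose(y_hat.sum(), 1.0)

predicted_index = int(np.argmax(y_hat))     # position of the largest entry
predicted_class = classes[predicted_index]
confidence = float(y_hat[predicted_index])

print(predicted_class, confidence)  # bird 0.75
```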
Label (y): This is the true label of the input image. The label is a one-hot vector of the same length as the prediction: the position corresponding to the correct class is 1, and all others are 0. For this image, the true label is 'bird', so the vector is:
- Cat: 0
- Bird: 1
- Dog: 0
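A label vector like the one above can be built from a class name with a small helper. This is a sketch: `one_hot` is a hypothetical function written for this example, and the class ordering is assumed to match the prediction vector.

```python
import numpy as np

classes = ["cat", "bird", "dog"]

def one_hot(label, classes):
    """Return a vector with 1 at the label's position and 0 elsewhere."""
    y = np.zeros(len(classes))
    y[classes.index(label)] = 1.0
    return y

y = one_hot("bird", classes)
print(y)  # [0. 1. 0.]
```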
Cross Entropy: This is a loss function used to measure the performance of the classification model. It compares the predicted probability distribution (ŷ) with the true distribution (y), and the result is a scalar value that quantifies the error of the prediction. A lower cross-entropy value means that the prediction is closer to the true label.
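For a label y and prediction ŷ over the same classes, cross entropy is H(y, ŷ) = -Σ y_i log(ŷ_i). With a one-hot label, only the true class contributes, so the loss reduces to -log of the probability assigned to that class. The sketch below evaluates this for the diagram's values using the natural logarithm. The `eps` clipping and the second, deliberately worse prediction are illustrative additions.

```python
import numpy as np

def cross_entropy(y, y_hat, eps=1e-12):
    """H(y, y_hat) = -sum_i y_i * log(y_hat_i), using the natural log."""
    y_hat = np.clip(y_hat, eps, 1.0)  # avoid log(0)
    return float(-np.sum(y * np.log(y_hat)))

y = np.array([0.0, 1.0, 0.0])          # true label: bird
y_hat = np.array([0.15, 0.75, 0.10])   # prediction from the diagram

loss = cross_entropy(y, y_hat)
print(round(loss, 4))  # 0.2877, i.e. -ln(0.75)

# A hypothetical worse prediction gives bird only 0.30 and scores higher.
worse = np.array([0.60, 0.30, 0.10])
print(round(cross_entropy(y, worse), 4))  # 1.204, i.e. -ln(0.30)
```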
Error (scalar): The cross-entropy calculation results in a single number that represents the error, or loss, for this particular prediction. During training, this value is used to adjust the network's weights, with the goal of reducing the error on future predictions.
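How the error drives weight updates can be sketched with plain gradient descent on a single linear layer followed by softmax. This is an assumed, simplified stand-in for the full network: the feature vector `x`, the weight shape, and the learning rate are hypothetical. It relies on the standard result that, for softmax with cross entropy, the gradient of the loss with respect to the logits is ŷ - y.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(y, y_hat):
    return float(-np.sum(y * np.log(y_hat)))

rng = np.random.default_rng(0)
x = rng.random(4)                   # hypothetical features of one image
W = rng.normal(0.0, 0.1, (3, 4))    # one row of weights per class
y = np.array([0.0, 1.0, 0.0])       # true label: bird
lr = 0.5                            # learning rate (assumed)

losses = []
for step in range(5):
    y_hat = softmax(W @ x)                   # forward pass
    losses.append(cross_entropy(y, y_hat))   # scalar error
    grad_W = np.outer(y_hat - y, x)          # dLoss/dW for softmax + CE
    W -= lr * grad_W                         # step against the gradient

print([round(l, 4) for l in losses])
```

Each step lowers the loss on this example because the update raises the score of the correct class. A real training loop would average gradients over many labeled images and propagate them back through every layer.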