Machine learning is the ability of a machine to “learn” and improve from experience without being explicitly programmed to do so by the coder. In supervised learning, for example, the machine is trained on one set of data and tested on another. In this post, we will cover the important terms and keywords needed to understand machine learning.
Training set: The set of images or other data that the model is trained on. This information forms the entire basis of the model, and how wide-ranging and exhaustive it is largely determines how ‘accurate’ the model is.
Testing set: The set of images or other data that the model is tested on. The model has to make predictions for this data, and only after the predictions are made can the accuracy be calculated.
Relations between Training and Testing Data sets:
- Training Set ∩ Testing Set = ∅: The intersection of the training and testing sets is the empty (null) set. Data must never be repeated across the training and testing sets, because a model tested on data it has already seen will report a misleadingly high accuracy.
- Ground Truth: The labels, or answers, of the testing or training data are called the ground truth.
- Accuracy%: Accuracy is a measurement of how correct the model is.
Accuracy% = (Number of correct predictions) ÷ (Total size of testing set) × 100
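As a rough illustration of the points above, here is a minimal Python sketch. The helper names `split_dataset` and `accuracy_percent`, the 25% test fraction and the toy fruit data are all assumptions made up for this example, not part of any library. It splits a dataset into disjoint training and testing sets and then applies the accuracy formula to a list of predictions.

```python
import random


def split_dataset(samples, test_fraction=0.25, seed=0):
    """Shuffle the samples and split them into disjoint training and testing lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # Slicing one shuffled list guarantees that no sample lands in both sets.
    return shuffled[n_test:], shuffled[:n_test]


def accuracy_percent(predictions, ground_truth):
    """Accuracy% = correct predictions / size of the testing set * 100."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground truth must have the same length")
    correct = sum(p == t for p, t in zip(predictions, ground_truth))
    return correct / len(ground_truth) * 100


# Hypothetical data: (sample id, ground-truth label).
samples = [(i, "apple" if i % 2 == 0 else "guava") for i in range(8)]
train_set, test_set = split_dataset(samples)

# Training Set ∩ Testing Set must be empty.
assert set(train_set).isdisjoint(test_set)

# Made-up predictions for four test images, three of which match the ground truth.
model_predictions = ["apple", "guava", "apple", "apple"]
true_labels = ["apple", "guava", "guava", "apple"]
print(accuracy_percent(model_predictions, true_labels))  # 75.0
```

Fixing the seed only makes the shuffle repeatable; in practice the split would usually be done once and the testing set kept aside until evaluation.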
Accuracy can also be calculated from a confusion matrix.
Say we have a fruit classification problem with two classes, guava and apple, where apple is treated as the positive class.
| |True (T)|False (F)|
|Positive (P)|TP|FP|
|Negative (N)|TN|FN|
There are 4 possible outcomes for each image the model classifies.
- TP (true positive) – The model says apple and the ground truth is also apple.
- FP (false positive) – The model says apple but the ground truth is guava.
- TN (true negative) – The model says it is not an apple and the ground truth is not an apple.
- FN (false negative) – The ground truth is apple but the model says it is not an apple.
Each box is incremented by 1 whenever an image falls into the corresponding category.
Now we can see that the only correct results are TP and TN.
∴ Accuracy = (TP+TN) ÷ (TP+TN+FP+FN)
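The counting described above can be sketched in a few lines of Python. This is only an illustration: the function names `confusion_counts` and `accuracy` and the six example predictions are hypothetical, and apple is assumed to be the positive class.

```python
def confusion_counts(predictions, ground_truth, positive="apple"):
    """Count TP, FP, TN and FN, treating `positive` as the positive class."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for predicted, actual in zip(predictions, ground_truth):
        if predicted == positive:
            # Model said apple: correct if the ground truth is apple too.
            counts["TP" if actual == positive else "FP"] += 1
        else:
            # Model said not apple: correct if the ground truth is not apple.
            counts["TN" if actual != positive else "FN"] += 1
    return counts


def accuracy(counts):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total


# Made-up model outputs and ground truth for six images.
predictions = ["apple", "apple", "guava", "guava", "apple", "guava"]
ground_truth = ["apple", "guava", "guava", "apple", "apple", "guava"]

counts = confusion_counts(predictions, ground_truth)
print(counts)            # {'TP': 2, 'FP': 1, 'TN': 2, 'FN': 1}
print(accuracy(counts))  # 4 correct out of 6, about 0.667
```

Multiplying the result by 100 gives the same Accuracy% as dividing the number of correct predictions by the size of the testing set.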
Types of Machine Learning
Machine learning is usually divided into 3 kinds: supervised, unsupervised and reinforcement learning.
In supervised learning, the model is trained on a fixed, labeled training set and then tested. Examples of supervised learning models are K-Nearest Neighbors, Decision Trees, Random Forest classification and logistic regression.
In unsupervised learning, there is no labeled training set. The model has to draw its outputs from the data itself, without any previously learned labels. Examples of unsupervised learning models are clustering algorithms such as K-Means, mean shift clustering and hierarchical clustering.
Reinforcement learning is about taking suitable actions to maximize rewards in particular situations. Typical examples are game-playing models, such as chess models, that learn from a situation and improve on it the next time.
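To make the supervised case concrete, here is a minimal sketch of K-Nearest Neighbors written in plain Python. It is an illustration under stated assumptions rather than a reference implementation: the function name `knn_predict`, the two features (weight and redness) and all the numbers are invented for this example.

```python
import math
from collections import Counter


def knn_predict(training_set, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest training points."""
    by_distance = sorted(training_set, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical features: (weight in grams, redness from 0 to 1), with ground-truth labels.
training_set = [
    ((150, 0.9), "apple"),
    ((160, 0.8), "apple"),
    ((170, 0.85), "apple"),
    ((90, 0.2), "guava"),
    ((100, 0.1), "guava"),
    ((95, 0.3), "guava"),
]
# The testing set is kept separate from the training set.
testing_set = [((155, 0.7), "apple"), ((92, 0.25), "guava")]

predictions = [knn_predict(training_set, features) for features, _ in testing_set]
print(predictions)  # ['apple', 'guava']
```

The model here "learns" only by storing the labeled training points. Because the features are not scaled, weight dominates the distance in this toy data; a real pipeline would normally normalize the features first.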