Machine Learning Problems

This topic is much broader than the previous one, “What is Machine Learning?”. As mentioned there, our goal is to break this complex field down into simpler concepts. In this article, we’ll touch only on the most important categories of machine learning problems. As we progress through the learning roadmap, we will delve deeper into each of them. For now, focus on understanding the basic categories, and don’t worry about the finer details just yet. This is just the beginning of your journey!

Supervised learning

In supervised learning, the machine is trained on labeled data, which means that for every input the correct output is known. The machine then learns to map each input to its correct output (that is, it learns the relationship between the data and the labels).

Classification

A type of supervised learning problem in which an algorithm is trained on a labeled dataset to predict the class (category) of new, unseen data.

Example problem: a classification model is trained on a dataset of images labeled as either cars or bikes, and is then used to predict the class of new, unseen images of cars or bikes based on their features.
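
To make the idea of "learning from labeled examples" concrete, here is a minimal sketch of a classifier. It is only an illustration: the nearest-neighbour rule, the two made-up features (number of wheels and weight in kg), and the names `TRAINING_DATA` and `predict` are assumptions chosen for this example, not something the article prescribes. Real image classifiers work on pixel data and use far more capable models.

```python
# Labeled training data: (features, label). The features are hypothetical:
# (number of wheels, weight in kg).
TRAINING_DATA = [
    ((4, 1200.0), "car"),
    ((4, 1500.0), "car"),
    ((2, 12.0), "bike"),
    ((2, 15.0), "bike"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(features):
    """Return the label of the closest training example (1-nearest neighbour)."""
    _, label = min(
        TRAINING_DATA,
        key=lambda example: squared_distance(example[0], features),
    )
    return label


print(predict((2, 10.0)))    # an unseen light two-wheeler
print(predict((4, 1300.0)))  # an unseen heavy four-wheeler
```

The model never sees the labels of the new examples; it only uses what it learned from the labeled training set.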

Binary Classification – when there are only two options for what a specific thing can be

  • Is this image a photo of a car or a bike?
  • Is this email spam or not spam?
  • Does this person have heart disease or not?

Multi-Class Classification – when there are three or more options, and each thing belongs to exactly one of them

  • Color of the traffic light
  • Breed of a dog
  • Brand of a car

Multi-label Classification – when there are two or more options and a specific thing may match none of them, some of them, or all of them at the same time

  • What topics is this article about?
  • Which traffic signs does the image contain?
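
One way to see the difference between these three problem types is to look at how the target (label) is usually represented. The sketch below is an assumption-laden illustration: the topic names and the helper `multi_hot` are made up for this example.

```python
# Binary: a single yes/no value per example.
is_spam = 1  # 1 = spam, 0 = not spam

# Multi-class: exactly one class per example, often stored as an index.
TRAFFIC_LIGHT_COLORS = ["red", "yellow", "green"]
light_color = TRAFFIC_LIGHT_COLORS.index("green")  # a single integer

# Multi-label: any number of classes per example, stored as one 0/1 flag per class.
TOPICS = ["sports", "politics", "technology"]  # hypothetical topic list


def multi_hot(labels, classes):
    """Encode a set of labels as a list of 0/1 flags, one per known class."""
    return [1 if c in labels else 0 for c in classes]


print(multi_hot({"sports", "technology"}, TOPICS))  # article about two topics
print(multi_hot(set(), TOPICS))                     # article about none of them
```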
Evaluation Metrics – Evaluation metrics give a clear picture of how well a machine learning model is performing. Without them, there’s no way to quantify the accuracy and effectiveness of a model.
Classification Evaluation Metrics
Confusion matrix

A confusion matrix summarizes the performance of a classification model by counting its correct and incorrect predictions, broken down by actual and predicted class. For a binary problem it has four cells:

  • True positive – correctly predicted a positive output
  • True negative – correctly predicted a negative output
  • False positive – incorrectly predicted a positive output (Type I error)
  • False negative – incorrectly predicted a negative output (Type II error)
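
The four cells above can be counted directly from a list of actual labels and a list of predictions. This is a minimal sketch; the function name `confusion_counts` and the ten example labels are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, and it really was positive
            else:
                fp += 1  # predicted positive, but it was negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, but it was positive
            else:
                tn += 1  # predicted negative, and it really was negative
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn}


# Hypothetical labels: 1 = has the condition, 0 = does not.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)  # {'tp': 3, 'tn': 5, 'fp': 1, 'fn': 1}
```
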
Accuracy

Out of 10 examples, how many did the model get right? Accuracy is the ratio of correct predictions to the total number of samples. It gives a quick snapshot of how well a machine learning model performs, although it can be misleading when one class is much more common than the other.

F1 Score (harmonic mean of precision and recall)
  • Precision – Of the positive predictions, what proportion are correct? True positives / (True positives + False positives). A model with 1.0 precision has no false positives.
  • Recall – Of the actual positives, what proportion did the model find? Also called sensitivity. True positives / (True positives + False negatives). A model with 1.0 recall has no false negatives.
  • F1 – 2 × (Precision × Recall) / (Precision + Recall). It is high only when both precision and recall are high.
  • Precision/Recall connection – Increasing precision typically reduces recall, and increasing recall typically reduces precision. You will have to decide whether fewer false negatives or fewer false positives matter more for your model. Plotting precision versus recall at different classification thresholds lets you choose a precision value at a given recall value.
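
As a worked example, accuracy, precision, recall and F1 can all be computed from the four confusion-matrix counts. This is a sketch with made-up counts; the function name `classification_metrics` is an assumption, and returning 0.0 when a denominator is zero is just one common convention.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts: 3 true positives, 5 true negatives,
# 1 false positive, 1 false negative (10 examples in total).
metrics = classification_metrics(tp=3, tn=5, fp=1, fn=1)
print(metrics)  # accuracy 0.8, precision 0.75, recall 0.75, f1 0.75
```
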
ROC Curve/Area Under Curve (AUC)

The Receiver Operating Characteristic (ROC) curve is a common way to evaluate binary classifiers. It plots the true positive rate (recall) against the false positive rate at different classification thresholds. The Area Under the Curve (AUC) is the space underneath the ROC curve. A perfect binary classifier will have an AUC of 1.0, while random guessing scores about 0.5.

If you have few positive examples, or if you care more about false positives than false negatives, prefer the Precision/Recall curve. Otherwise, you can use ROC/AUC.
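
For small datasets, AUC can be computed without drawing the curve at all, using the fact that it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one (ties count as half). The sketch below uses that pairwise formulation; the function name `roc_auc` and the scores are made up for illustration.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    correctly_ranked = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correctly_ranked += 1.0  # positive scored above negative
            elif p == n:
                correctly_ranked += 0.5  # a tie counts as half
    return correctly_ranked / (len(positives) * len(negatives))


# Hypothetical model scores (higher = more likely positive).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, scores))  # 0.75
```

This pairwise version is quadratic in the number of examples, so it is meant for understanding rather than for large datasets.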

Regression

A type of supervised learning problem in which an algorithm is trained to model the relationship between a dependent target and one or more independent features in order to predict continuous values.

Example problems:

  • How much will Bitcoin be worth tomorrow, given how the market has been moving?
  • Predicting the price of a house from its given features
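
As a small illustration of predicting a continuous value, the sketch below fits a straight line to a handful of made-up house sizes and prices using ordinary least squares. The data, the single feature, and the names `fit_line` and `predict_price` are assumptions for this example; real house-price models use many more features.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: size in square metres, price in thousands.
sizes = [50.0, 100.0, 150.0]
prices = [150.0, 250.0, 350.0]
slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """Predict a continuous price for an unseen house size."""
    return slope * size + intercept


print(slope, intercept)      # 2.0 50.0
print(predict_price(200.0))  # 450.0
```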
Regression Evaluation Metrics
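  1. R^2 (R-squared) – How well does the model fit the data? It measures the proportion of the variance in the target that the model explains. A perfect model scores 1.0.
  2. MSE (Mean Squared Error) – The average of the squared errors. Squaring makes large errors (outliers) stand out more. Use it if being 10% off is more than twice as bad as being 5% off.
  3. MAE (Mean Absolute Error) – The average of the absolute errors, so all errors are on the same scale. If trying to predict 10, predicting 9 is the same error as predicting 11.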
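
The three metrics can be computed in a few lines. This is a sketch on made-up numbers; the function name `regression_metrics` is an assumption, and R^2 here is the usual 1 − (sum of squared residuals / total sum of squares), which is undefined when all targets are identical.

```python
def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE and R^2 for paired lists of targets and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total variation
    r2 = 1.0 - ss_res / ss_tot

    return {"mae": mae, "mse": mse, "r2": r2}


# Hypothetical targets and predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
result = regression_metrics(y_true, y_pred)
print(result)  # mae 0.5, mse 0.5, r2 0.9
```

Note how squaring changes the picture: an error of 2 contributes four times as much to MSE as an error of 1, but only twice as much to MAE.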

Sequence to Sequence

A sequence-to-sequence model is a kind of machine learning model that takes sequential data as input and also generates sequential data as output. These are typically supervised learning problems.

Example problem: translate a given sequence of English words into a sequence of French words.

Unsupervised learning

In unsupervised learning, the machine is trained on unlabeled data, which means there is no known, fixed output for each input. The machine instead learns from the data itself, discovering relationships and patterns in it, and reports those as its output.

Clustering

A type of unsupervised learning that groups data points based on their similarity to each other.

Example problem: grouping customers by behavior. You have unlabeled data (customer age, income, expenses), and the algorithm assigns each customer to a group (for example high spenders or low spenders) based on similarities in expenses, income, and so on.
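
The customer example can be sketched with k-means, one common clustering algorithm. Note the assumptions: the article does not name a specific algorithm, the customer figures are made up, and the starting centroids are fixed by hand so the result is reproducible. In practice features are usually rescaled first and the starting centroids are chosen randomly.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, max_iterations=20):
    """Plain k-means: alternate between assigning points and moving centroids."""
    centroids = [tuple(float(v) for v in c) for c in centroids]
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = []
        for point in points:
            distances = [squared_distance(point, c) for c in centroids]
            labels.append(distances.index(min(distances)))

        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for index, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == index]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep an empty cluster where it was

        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical customers: (income, expenses), both in thousands per year.
customers = [(20, 5), (22, 6), (25, 7), (80, 40), (85, 45), (90, 50)]
labels, centers = kmeans(customers, centroids=[customers[0], customers[3]])
print(labels)   # [0, 0, 0, 1, 1, 1]
print(centers)
```

No labels such as "high spender" were given to the algorithm; it only produced group numbers, and it is up to us to interpret group 1 as the high spenders.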

Dimensionality Reduction

A type of unsupervised learning that reduces the number of features in a dataset while retaining the most relevant information.

Example problem: you need to decide whether a person has heart disease or not. It may be better to drop information such as address, height, and weight and focus only on age and symptoms. One widely used technique for reducing the number of inputs is principal component analysis (PCA), which combines the original features into a smaller set of new ones that keep as much of the variation in the data as possible.
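
To give a feel for what PCA does, here is a minimal sketch that reduces two features to one. It finds the direction of greatest variance (the first principal component) with power iteration on the covariance matrix and projects the data onto it. The function name, the tiny dataset, and the use of power iteration are assumptions for illustration; libraries normally use more robust linear-algebra routines.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (direction, projected values) for the first principal component."""
    n = len(rows)
    d = len(rows[0])

    # Center each feature on its mean.
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]

    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

    # Power iteration: repeatedly multiply by the covariance matrix and
    # normalize, which converges to its dominant eigenvector.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance at all in the data
        v = [x / norm for x in w]

    # Project every centered row onto that direction: d features become 1.
    projected = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projected


# Hypothetical data where the second feature is exactly twice the first,
# so one new feature can describe both without losing information.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, reduced = first_principal_component(data)
print(direction)  # roughly [0.447, 0.894]
print(reduced)    # one value per row instead of two
```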

Transfer learning

Transfer learning means taking the knowledge of an already pre-trained model and using it in your own, different but related task.

Example problem: a model pre-trained to recognize dogs can be adapted relatively easily to our new task of recognizing cats.

Reinforcement Learning

In reinforcement learning, an algorithm (the agent) performs actions in an environment and is rewarded or penalized depending on whether those actions were favorable or not.

Example problem: a chess-playing algorithm. The environment is a virtual chessboard and the actions are moving pieces. If a piece is moved into a bad position, the algorithm gets penalized, and if it takes an opponent’s piece, it gets rewarded.

The strategy an algorithm (agent) learns, based on the actions it gets rewarded for, is called a policy.
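
Chess is far too large for a short example, so the sketch below uses a much smaller made-up environment: an agent on a line of five squares that is rewarded only when it reaches the rightmost square. It learns with tabular Q-learning, which is one reinforcement learning algorithm among many and is an assumption here, as are all names and parameter values. The learned policy is simply the best-valued action in each square.

```python
import random

N_STATES = 5        # squares 0..4 on a line; square 4 is the goal
ACTIONS = (-1, +1)  # action 0 moves left, action 1 moves right


def step(state, action_index):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # rewarded only for reaching the goal
    return next_state, reward, done


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the length of an episode
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore: try a random action
            else:
                best = max(q[state])       # exploit: pick a best-known action
                action = rng.choice([a for a in range(2) if q[state][a] == best])

            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])

            state = next_state
            if done:
                break
    return q


q_table = train()
# The policy: for each non-goal square, the action with the higher value.
policy = ["right" if q[1] > q[0] else "left" for q in q_table[:-1]]
print(policy)
```

After training, the policy should point toward the goal from every square, because moving right is the only way to collect the reward.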

Conclusion

In this article we have discussed the categories of machine learning problems, ranging from supervised and unsupervised learning to other kinds such as transfer learning and reinforcement learning. Understanding these categories is essential because it lets you choose the most appropriate method for solving a specific problem, be it outcome prediction, data clustering, or enabling agents to make choices. The following article will walk you through the full machine learning process as we continue to go deeper into the subject, helping you understand how these methods work together to create models that are applicable in the real world. Keep checking back, and see you there!