When it comes to the uses of machine learning, classification algorithms play a very important role in the products we use in our day-to-day lives. Classification algorithms are used for making a distinction between different categories based on certain data which can then be used to predict results by classifying them into those distinct categories. Did it go over your head? no worries! cause today we will learn about one of them and that is Logistic Regression.
Let's learn through an example
Suppose, you work in an insurance company and the dataset you have tells you about the people who have and haven't taken the health insurance policy based on their age. Now as per our dataset, we can only have the prediction values be 1 or 0(yes or no).
Now, given the data points, the logistic regression algorithm gives us the curve shown above. This curve is called the sigmoid curve which is based on the sigmoid function (a mathematical function that returns a value between 0 and 1 for any value). Now, as per this curve, we can predict the value for a random value that is shown in the curve(i.e 35 and 45).
This mathematical equation on which this plot is based is -
ln(p/(1-p)) = b0 + b1X1
there is no Y in the above formula as we only predict the probability p for a value of X1, which is the input feature Age(in yrs). Now, since we have our curve, the logistic regression algorithm will define a line (the yellow dotted line) that will decide whether a person will buy the insurance or not. If the data point is above 50% then that person will be considered a potential buyer of the policy(i.e YES) and if the probability is less than 50% for a person then that person will be classified as a person who won't buy the policy(i.e NO).
So, it's an interesting concept using which we can make amazing projects or even make important decisions such as classifying a person to have heart disease or not. This article is just a surface explanation of the logistic regression method, and surely there's more to come from my side on this. But make sure that you understand the intuition behind it.
On a Final note
The logistic Regression algorithm is considered a linear classifier as the predictions are based on the linear combination of the input features(just check the formula again). And one more thing, we can have multiple input features which obviously would be more suitable for an insurance company and give us a higher dimensional plot.
If you didn't understand something or anything, I encourage you to google it or YouTube it. Seriously, this is great practice ;).
If you have reached the end of this article, then I feel most obliged. Hope my article has brought something of value to you ;). Feel free to like or not like this article. And pls share your thought as comments, if you have any.
If you are a fan of newsletters, you can subscribe to my newsletter, given below.
You can also follow me on Hashnode and other platforms if you like my writing.
Help your friends out by sharing this article. Cheers!