Regressions in Machine Learning


Logistic Regression

Logistic regression is a supervised machine learning algorithm that predicts the probability that a binary outcome variable takes a particular value. For example, we might want to predict whether or not a patient has cancer. The model is trained on data from patients who have been diagnosed with cancer and patients who have not, and it then estimates the probability that a new patient has cancer based on their symptoms.
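As a rough illustration of the idea, the sketch below fits a one-feature logistic regression by plain gradient descent. The feature name, the data values, the learning rate and the number of epochs are all made up for the example; it is a minimal sketch, not a clinical model.

```python
import math

def sigmoid(z):
    # Logistic function: maps any real number into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    # Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent on the log loss.
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical toy data: one made-up measurement per patient, label 1 = diagnosed.
measurement = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
diagnosed = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(measurement, diagnosed)
prob_small = sigmoid(w * 1.0 + b)   # estimated probability for a small measurement
prob_large = sigmoid(w * 4.5 + b)   # estimated probability for a large measurement
print(prob_small, prob_large)
```

The model outputs a probability, so a threshold (commonly 0.5) is still needed to turn it into a yes/no prediction.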


Linear Regression

Linear regression is a technique for modelling the relationship between one or more predictor variables and a continuous outcome by fitting a straight line (or, with several predictors, a plane) to the data. Logistic regression (logit), by contrast, is a binary classification technique: it models the log-odds of the outcome as a linear function of the predictors, which may be continuous or categorical. The two techniques answer different questions, so neither gives better results in general. Linear regression suits continuous targets, and logistic regression suits binary ones.


Decision Tree

A decision tree is a simple predictive model that classifies instances into two or more classes using recursive partitioning. Each internal node represents a test on some feature, and each leaf represents a class label. At each step, the best split is chosen among all features, and the left and right child nodes are then grown recursively until a stopping condition is met (for example, the node is pure or a maximum depth is reached) and they become leaves. To classify an instance, we follow the tests from the root down to a leaf; the prediction is the majority class of the training samples that fell into that leaf.
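The recursive partitioning described above can be sketched in a few lines. This is a minimal illustration that assumes Gini impurity as the split criterion and a fixed depth limit; the data and the function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    # Gini impurity: 0 when all labels agree, larger when classes are mixed.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    # Try every feature and every observed threshold; keep the split with
    # the lowest weighted impurity of the two children.
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    # Stop at pure nodes or at the depth limit; otherwise look for a split.
    split = None
    if gini(labels) > 0.0 and depth < max_depth:
        split = best_split(rows, labels)
    if split is None:
        # Leaf: predict the majority class of the samples that reached it.
        return Counter(labels).most_common(1)[0][0]
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    # Walk from the root to a leaf, following the test at each internal node.
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical toy data: two numeric features, two classes.
rows = [[2.0, 1.0], [3.0, 1.5], [1.0, 3.0], [6.0, 2.0], [7.0, 3.5], [8.0, 1.0]]
labels = ["a", "a", "a", "b", "b", "b"]
tree = build_tree(rows, labels)
print(predict(tree, [2.5, 9.0]), predict(tree, [6.5, 0.0]))
```

Internal nodes are stored as tuples (feature index, threshold, left subtree, right subtree) and leaves as plain labels, which keeps the sketch short but assumes the labels themselves are not tuples.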


Random Forest

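A random forest is a collection of many decision trees. Each tree is trained independently on a bootstrap sample of the training data (a sample drawn with replacement), and typically only a random subset of the features is considered at each split. At prediction time, the results of the trees are combined by majority vote for classification or by averaging for regression. This procedure, known as bootstrap aggregating (bagging), helps reduce overfitting and improve generalization performance.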
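To make the bootstrap-and-vote idea concrete, here is a deliberately simplified sketch. It assumes one-level trees (stumps) that each look at a single randomly chosen feature, whereas a real random forest grows deeper trees and samples features at every split. The data, the seed and the function names are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, feature):
    # One-level tree on a single feature: pick the threshold with the fewest
    # training mistakes when each side predicts its own majority class.
    best = None
    for t in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        if not left or not right:
            continue
        l_cls, r_cls = majority(left), majority(right)
        errors = sum(y != l_cls for y in left) + sum(y != r_cls for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, l_cls, r_cls)
    if best is None:
        # Degenerate sample with no usable split: constant prediction.
        m = majority(labels)
        return (feature, float("inf"), m, m)
    return (feature, best[1], best[2], best[3])

def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    n, forest = len(rows), []
    for _ in range(n_trees):
        # Bootstrap sample: draw n rows with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        # A random feature per tree makes the trees less correlated.
        feature = rng.randrange(len(rows[0]))
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    # Aggregate: each tree votes and the most common class wins.
    votes = [l_cls if row[f] <= t else r_cls for f, t, l_cls, r_cls in forest]
    return majority(votes)

# Hypothetical toy data: both features separate the two classes.
rows = [[1.0, 2.0], [2.0, 1.0], [1.5, 2.5], [2.5, 1.5],
        [6.0, 7.0], [7.0, 6.0], [6.5, 7.5], [7.5, 6.5]]
labels = ["low", "low", "low", "low", "high", "high", "high", "high"]
forest = fit_forest(rows, labels)
print(predict_forest(forest, [0.5, 0.5]), predict_forest(forest, [9.0, 9.0]))
```

Because every tree sees a slightly different sample, individual trees can be wrong in different ways, and the vote tends to cancel those errors out.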


Polynomial Regression

Polynomial regression extends linear regression by adding powers of the predictors (x², x³, and so on) as extra terms, which allows it to model non-linear relationships between variables. With a high enough degree, a polynomial can approximate many smooth non-linear relationships over a limited range. This makes it suitable for modelling data sets where some variables have a non-linear effect on others. However, high-degree polynomials tend to overfit noisy data and are sensitive to outliers, so the degree should be kept as low as the data allows. As with linear models, categorical or ordinal variables must be encoded numerically before they can be used.
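A polynomial fit is still a linear least-squares problem once each x is expanded into its powers. The sketch below assumes a single predictor and solves the normal equations with a small hand-written Gaussian elimination; the helper names and the data are hypothetical, and in practice a numerical library would normally be used instead.

```python
def solve(a, b):
    # Solve the linear system a * x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_polynomial(xs, ys, degree):
    # Expand each x into [1, x, x^2, ...] and solve the normal equations
    # (X^T X) beta = X^T y for the least-squares coefficients.
    X = [[x ** p for p in range(degree + 1)] for x in xs]
    k = degree + 1
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical noise-free data generated from y = 1 + 2x + 3x^2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]
coeffs = fit_polynomial(xs, ys, degree=2)  # ordered as [constant, x, x^2]
print(coeffs)
```

Raising the degree always lowers the training error, which is exactly why the degree should be chosen with held-out data rather than by fit quality alone.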


Multiple Linear Regression


What is multiple linear regression? Multiple linear regression (MLR) is a statistical method used to build predictive models based on multiple variables. In MLR, we assume that several independent factors influence a dependent variable (e.g., a person's income). These factors might include age, education, geographical location, and others. We then fit a linear equation to the data that predicts the value of the dependent variable from the values of the independent variables.


The MLR model is primarily used to analyze data sets where the dependent variable is continuous. A typical example would be predicting a student's college grade point average based on their high school grades and extracurricular activities. If we only wanted to predict whether or not they will graduate from college, a classification method such as logistic regression would be appropriate, because the outcome is binary. If instead we want to estimate a numeric outcome, such as the grade they are likely to achieve, then the multiple linear regression (MLR) model is the right tool.



Regression Analysis in Machine Learning


Regression analysis is used to determine whether, and how strongly, an outcome variable is related to one or more predictor variables. Regression models are most often used to study relationships between predictor (independent) variables and a continuous outcome (dependent) variable.


These models can predict the value of the dependent variable from the values of the independent variables. Linear regression, logistic regression, Poisson regression, and other models have been developed over decades for various applications. In this project we use a linear regression model to find out how much temperature affects the size of the crops grown by farmers in south India.


This gives a quantitative measure of how climate change impacts agricultural production. We use meteorological data obtained from weather stations maintained by the India Meteorological Department (IMD). The IMD provides daily records of rainfall, maximum and minimum temperatures, and humidity for these stations to researchers around the world.


Regression analysis is a statistical method that estimates how one variable changes with another. The variable being predicted is called the dependent variable, and the variable used to predict it is called the independent variable. The data set is then analysed to fit an equation that predicts the value of the dependent variable given values of the independent variable.


There are many different ways to conduct this kind of analysis, but the simplest is linear regression. If we assume that the relationship between the independent and dependent variables is a straight line, then we can use the equation below to find out how much the independent variable affects the dependent variable (the predicted value).


y = mx + b


Here y stands for the predicted value, x is the independent variable, m represents the slope, and b represents the intercept. The line obtained by fitting m and b to the data is known as the regression line.
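For a single predictor, the least-squares slope and intercept have a closed form, which the sketch below computes directly. The temperature and yield numbers are invented for illustration only and are not IMD data.

```python
def fit_line(xs, ys):
    # Least-squares estimates for y = m*x + b:
    # m = covariance(x, y) / variance(x), b = mean(y) - m * mean(x).
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

# Hypothetical, made-up values: temperature in degrees Celsius and a crop yield figure.
temps = [20.0, 22.0, 24.0, 26.0, 28.0]
yields = [30.0, 29.0, 27.5, 26.5, 25.0]

m, b = fit_line(temps, yields)
predicted = m * 25.0 + b  # predicted yield at 25 degrees
print(m, b, predicted)
```

A negative slope here would mean that, in this invented sample, yield falls as temperature rises; the size of m says by how much per degree.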


It is possible to extend this equation further to include quadratic terms, cubic terms, etc. However, it is usually best to keep things simple and stick with a linear model unless there is a particular reason to do otherwise.


Backward Elimination in Machine Learning


Backward elimination is a method of variable selection in statistics that works by removing from consideration those variables that contribute least to the accuracy of model predictions. This approach is used primarily in regression analysis and classification. The procedure starts with a model that contains all candidate predictors and then removes the least useful predictor one at a time, refitting the model after each removal, until every remaining predictor meets a chosen criterion (for example, a p-value below a significance threshold). Backward elimination is one of several ways to control model complexity. Others include forward stepwise regression (also known as forward selection), best subset regression, ridge regression, partial least squares regression, and principal component regression.
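The loop below is one possible sketch of backward elimination for a linear regression. As an assumption made to keep the example dependency-free, it uses adjusted R² as the removal criterion instead of p-values: a predictor is dropped whenever the model without it has a higher adjusted R². The data and helper names are hypothetical; the third column is constructed to have no effect on y.

```python
def solve(a, b):
    # Solve the linear system a * x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def adjusted_r2(rows, ys, features):
    # Fit ordinary least squares with an intercept on the chosen columns and
    # return adjusted R^2, which penalizes each extra predictor.
    X = [[1.0] + [r[f] for f in features] for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(X, ys)) for i in range(k)]
    beta = solve(xtx, xty)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - sum(bi * xi for bi, xi in zip(beta, x))) ** 2 for x, y in zip(X, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    n, p = len(ys), len(features)
    return 1.0 - (ss_res / (n - p - 1)) / (ss_tot / (n - 1))

def backward_elimination(rows, ys):
    features = list(range(len(rows[0])))  # start with every predictor
    score = adjusted_r2(rows, ys, features)
    while len(features) > 1:
        # Score every model that drops exactly one remaining predictor.
        candidates = [(adjusted_r2(rows, ys, [f for f in features if f != d]), d)
                      for d in features]
        best_score, drop = max(candidates)
        if best_score <= score:
            break  # every removal makes the model worse, so stop
        features.remove(drop)
        score = best_score
    return features

# Hypothetical data: y depends on columns 0 and 1 only; column 2 is irrelevant.
rows = [[-1, -1, -1], [-1, -1, 1], [-1, 1, -1], [-1, 1, 1],
        [1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1]]
ys = [2.0, 3.0, 0.0, -1.0, 7.0, 6.0, 3.0, 4.0]
kept = backward_elimination(rows, ys)
print(kept)
```

With a p-value criterion the structure of the loop would be the same; only the rule for choosing which predictor to drop, and when to stop, would change.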


Alternatives to Backward Elimination

The following are some of the most common techniques used alongside or instead of backward elimination:


Forward Stepwise Regression - In this method, predictors are added to the model one at a time, starting from an empty model, until adding another predictor no longer improves the fit.


Ridge Regression - Adds a penalty proportional to the sum of the squared coefficients, which shrinks all coefficients towards zero without setting any of them exactly to zero.


Partial Least Squares Regression - A form of multivariate regression that projects the predictors in the X matrix onto a small number of components chosen for their covariance with the response, and then regresses the response on those components.
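Of the techniques listed above, ridge regression is the easiest to sketch, because it has a closed-form solution. The example below assumes the usual convention of centering the data so that the intercept is not penalized, and it solves (XᵀX + λI)β = Xᵀy directly. The data, the penalty value and the helper names are hypothetical.

```python
def solve(a, b):
    # Solve the linear system a * x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ridge_fit(rows, ys, lam):
    # Center the data so the intercept is left unpenalized, then solve
    # (X^T X + lam * I) beta = X^T y for the slope coefficients.
    n, k = len(rows), len(rows[0])
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]
    mean_y = sum(ys) / n
    X = [[r[j] - col_means[j] for j in range(k)] for r in rows]
    yc = [y - mean_y for y in ys]
    a = [[sum(x[i] * x[j] for x in X) + (lam if i == j else 0.0) for j in range(k)]
         for i in range(k)]
    b = [sum(x[i] * y for x, y in zip(X, yc)) for i in range(k)]
    beta = solve(a, b)
    intercept = mean_y - sum(bj * mj for bj, mj in zip(beta, col_means))
    return intercept, beta

# Hypothetical data with two strongly correlated predictors.
rows = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9], [5.0, 5.1]]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept_ols, beta_ols = ridge_fit(rows, ys, lam=0.0)       # lam = 0 is ordinary least squares
intercept_ridge, beta_ridge = ridge_fit(rows, ys, lam=10.0)  # larger lam means more shrinkage
print(beta_ols, beta_ridge)
```

Unlike backward elimination, ridge regression keeps every predictor in the model; it controls complexity by making the coefficients smaller rather than by removing variables.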

