Linear Regression

Explore the fundamentals of linear regression and how to apply it using scikit-learn. Understand fitting models to data, making predictions, and evaluating fit quality with R squared. This lesson helps you grasp when linear regression is suitable and shows how to create simple predictive models for continuous variables.

Chapter Goals:

  • Understand linear regression and its applications
  • Fit a basic linear regression model using scikit-learn
  • Make predictions and evaluate model performance

A. Introduction to linear regression

One of the main objectives in both machine learning and data science is finding an equation or distribution that best fits a given dataset. This is known as data modeling, where we create a model that uses the dataset’s features as independent variables to predict output values for some dependent variable (with minimal error).

However, finding an optimal model is very difficult for most datasets, because real-world data contains noise (i.e., random errors and fluctuations).

Since finding an optimal model for a dataset is difficult, we instead try to find a good approximating distribution. In many cases, a linear model (a linear combination of the dataset’s features) can approximate the data well. The term linear regression refers to using a linear model to represent the relationship between a set of independent variables and a dependent variable.

$y = ax_1 + bx_2 + cx_3 + d$ ...
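
To make the linear model above concrete, here is a minimal sketch of fitting it with scikit-learn's `LinearRegression`. The coefficient values, the synthetic data, and the variable names are assumptions chosen purely for illustration; they do not come from the lesson. The targets are generated without noise, so the fitted coefficients should land very close to the values used to create the data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical coefficients a, b, c and intercept d, chosen only for illustration.
a, b, c, d = 2.0, -1.0, 0.5, 4.0

# Made-up feature matrix: 50 observations of the features x_1, x_2, x_3.
rng = np.random.default_rng(0)
X = rng.uniform(-5.0, 5.0, size=(50, 3))

# Noise-free targets generated from y = a*x_1 + b*x_2 + c*x_3 + d.
y = a * X[:, 0] + b * X[:, 1] + c * X[:, 2] + d

# Fit the linear model: coef_ estimates (a, b, c), intercept_ estimates d.
model = LinearRegression()
model.fit(X, y)
print("coefficients:", model.coef_)
print("intercept:", model.intercept_)

# Predict the dependent variable for a new observation.
new_point = np.array([[1.0, 2.0, 3.0]])
prediction = model.predict(new_point)
print("prediction:", prediction)

# score() returns R squared, the coefficient of determination, on the given data.
r_squared = model.score(X, y)
print("R squared:", r_squared)
```

With real data the targets would include noise, so the recovered coefficients would only approximate the underlying relationship and R squared would fall below 1.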