February 20, 2015 · machine learning Python

A friendly introduction to linear regression (using Python)

A few weeks ago, I taught a 3-hour lesson introducing linear regression to my data science class. It's not the fanciest machine learning technique, but it is a crucial technique to learn for many reasons:

The most accessible (yet thorough) introduction to linear regression that I've found is Chapter 3 of An Introduction to Statistical Learning (ISL) by James, Witten, Hastie, and Tibshirani. Their examples are crystal clear and the material is presented in a logical fashion, but the chapter covers a lot more detail than I wanted to present in class. Also, their code is written in R, and my data science class is taught in Python.

My Jupyter Notebook on linear regression

When teaching this material, I essentially condensed ISL Chapter 3 into a single Jupyter Notebook, focusing on the points that I consider to be most important and adding a lot of practical advice. I also wrote all of the code in Python, using both Statsmodels and scikit-learn to implement linear regression.

Click here to view the Jupyter Notebook.
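
For a flavor of what fitting a model looks like in the two libraries, here is a minimal sketch. It is not taken from the Notebook: the data is synthetic, and the column names `x` and `y` are made up purely for illustration. It fits the same simple linear regression with Statsmodels and with scikit-learn so you can see that the estimated coefficients agree.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration): y = 2 + 3x plus a little noise
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": np.linspace(0, 10, 50)})
df["y"] = 2 + 3 * df["x"] + rng.normal(scale=0.5, size=len(df))

# Statsmodels: the formula interface adds the intercept term automatically
sm_model = smf.ols(formula="y ~ x", data=df).fit()
sm_intercept = sm_model.params["Intercept"]
sm_slope = sm_model.params["x"]

# scikit-learn: expects a 2-D feature matrix and a 1-D response
sk_model = LinearRegression().fit(df[["x"]], df["y"])
sk_intercept = sk_model.intercept_
sk_slope = sk_model.coef_[0]

print(f"Statsmodels:  intercept={sm_intercept:.3f}, slope={sm_slope:.3f}")
print(f"scikit-learn: intercept={sk_intercept:.3f}, slope={sk_slope:.3f}")
```

Both libraries solve the same least-squares problem, so the numbers should match. The practical difference is in what you get back: the Statsmodels result has a `summary()` method with standard errors and p-values, while the scikit-learn estimator is built around `fit` and `predict`.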

Table of contents

Here is a detailed list of topics covered in the Notebook:


If you would like to go deeper into linear regression, here are a few resources I would suggest:

If you liked this Notebook, here are some other Data School resources that might interest you:

Do you have any questions about linear regression in Python? Please let me know in the comments below!
