January 25, 2018 · Python tutorial

9 new pandas updates that will save you time

Since launching my Python pandas video series in 2016, there have been 10 new releases of the pandas library, including hundreds of new features, bug fixes, and API changes. I was running version 0.18 at the time, but I've finally upgraded to version 0.22 (the latest release).

Whenever I update a Python library, I want to know two things:

  1. What are the new features I might want to use?
  2. What are the API changes I need to know about?

To figure this out, I spent a few hours skimming through the pandas release notes, compiling a list of changes that might be of interest to me and my students. Then I recorded two new pandas videos explaining these changes, so that you can benefit from my research!

Video: 4 new time-saving tricks in pandas

  1. Create a datetime column from a DataFrame (1:18)
  2. Create a category column during file reading (4:24)
  3. Convert the data type of multiple columns at once (7:45)
  4. Apply multiple aggregations on a Series or DataFrame (9:48)
  5. Bonus: Download the official pandas cheat sheet (13:14)

Note: Click on a timestamp to jump directly to that point in the video, or open the Jupyter notebook if you just want to read the code.

Video: 5 new changes in pandas you need to know about

  1. ix has been deprecated (1:19)
  2. Aliases have been added for isnull and notnull (7:04)
  3. drop now accepts "index" and "columns" keywords (10:37)
  4. rename and reindex now accept "axis" keyword (13:35)
  5. Ordered categories must be specified independent of the data (16:18)

Note: Click on a timestamp to jump directly to that point in the video, or open the Jupyter notebook if you just want to read the code.

If you're relatively new to pandas, these two videos might not make a lot of sense, so I would suggest starting with my complete pandas video series instead.

If you want a longer introduction to any of the topics covered above, check out these related videos:

If you need to figure out what version of pandas you're running, simply import pandas as pd, then run pd.__version__ (note the two double underscores).

If you want to update pandas, you can use conda if you're running the Anaconda distribution, or pip otherwise.

Thanks for watching! If you have any questions, I'd love to hear from you in the comments section below.

P.S. Want to be the first to know when I release new data science tutorials? Subscribe to the Data School newsletter.

  • LinkedIn
  • Tumblr
  • Reddit
  • Google+
  • Pocket
Comments powered by Disqus