Top 8 resources for learning data analysis with pandas
I recently launched a video series about "pandas", a popular Python library for data analysis, manipulation, and visualization. But for those of you who want to learn pandas and prefer the written word, I've compiled my list of recommended resources:
Intro to pandas data structures: This is the first post in Greg Reda's classic three-part pandas tutorial (part 2, part 3). It's highly readable, presents the "right" level of detail for a pandas beginner, and includes lots of useful examples.
Introduction to Pandas / Data Wrangling with Pandas / Plotting and Visualization in Python: Three extremely long (but well-written) Jupyter notebooks from Chris Fonnesbeck's Advanced Statistical Computing course at Vanderbilt University (my alma mater!). If you want to go deep into the details and learn about many powerful pandas features, these notebooks are for you.
Python for Data Analysis: This book was written by the creator of pandas, Wes McKinney, back in 2012. It covers IPython, NumPy, and pandas, and also includes an excellent appendix of "Python Language Essentials". It's still probably the best pandas book out there, though it might be worth waiting to buy until late 2017 when the second edition is released. (Wes is currently accepting suggestions for the book!)
Common Excel Tasks Demonstrated in Pandas: If you're coming from an Excel background, this post (and part 2) may help you to build a mental model for how pandas "thinks". It's from Chris Moffitt's excellent blog, Practical Business Python.
Translating SQL to pandas: This Jupyter notebook from Greg Reda may be helpful if you are transitioning from SQL to pandas. (Here's the related video presentation.)
Modern Pandas: This is a recent seven-part series by Tom Augspurger, a contributor to pandas, primarily targeting intermediate pandas users who want to make their code more modern and idiomatic.
If you prefer reading code snippets (rather than articles or books) to learn a language, you might like Mark Graph's 10-page Cheat sheet to the pandas DataFrame object or Chris Albon's Data Wrangling code samples.
All of the code from my pandas video series is available for you to browse, in a well-commented Jupyter notebook.
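To give a flavor of what translating SQL to pandas looks like, here is a minimal sketch. The `orders` table, its column names, and the query are made up for illustration and are not taken from any of the resources above. The named-aggregation syntax assumes pandas 0.25 or later.

```python
import pandas as pd

# Hypothetical data for illustration only.
orders = pd.DataFrame({
    "customer": ["ann", "bob", "ann", "cat", "bob"],
    "amount": [20.0, 5.0, 30.0, 12.5, 7.5],
})

# Roughly equivalent SQL:
#   SELECT customer, SUM(amount) AS total
#   FROM orders
#   WHERE amount > 6
#   GROUP BY customer
#   ORDER BY total DESC;
totals = (
    orders[orders["amount"] > 6]            # WHERE: boolean mask on rows
    .groupby("customer", as_index=False)    # GROUP BY, keeping customer as a column
    .agg(total=("amount", "sum"))           # SUM(amount) AS total
    .sort_values("total", ascending=False)  # ORDER BY total DESC
    .reset_index(drop=True)                 # tidy 0..n-1 row labels
)

print(totals)
```

Each chained method lines up with one SQL clause, which is one way to build the mental mapping when moving between the two.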
What excellent pandas resources did I miss? Let me know in the comments section below!
P.S. Want to be the first to know when I launch an online course about pandas? Subscribe to the Data School newsletter.
Top 8 resources for learning #Python pandas: https://t.co/jIyGuuqCsy featuring @wesmckinn @fonnesbeck @gjreda @TomAugspurger @chrisalbon ... (Kevin Markham, @justmarkham, May 17, 2016)