Which Machine Learning course is right for you?
Data School offers four Machine Learning courses:
- Course 1: Introduction to Machine Learning with scikit-learn
- Course 2: 50 scikit-learn tips
- Course 3: Building an Effective Machine Learning Workflow with scikit-learn
- Course 4: Machine Learning with Text in Python
Keep reading to find out which course is right for you!
Course 1: Introduction to Machine Learning with scikit-learn
This is the perfect course for you if:
- You're brand new to Machine Learning
- You have Machine Learning experience, but you're new to scikit-learn
- You've used scikit-learn, but you don't really know if you're doing things the "right" way
Topics covered:
- What is Machine Learning?
- Why use scikit-learn?
- Installing scikit-learn & Jupyter notebook
- Jupyter notebook basics
- Machine Learning terminology
- Machine Learning workflow
- Loading a dataset using pandas
- Preprocessing categorical features
- Model training & prediction
- Regression with Linear Regression
- Classification with KNN & Logistic Regression
- Model evaluation with train/test split & cross-validation
- Metrics for regression & classification
- Hyperparameter tuning with grid search & randomized search
Length: 4 hours
Cost: FREE
Course includes: Jupyter notebooks with detailed lesson notes, interactive quizzes, 80+ resources to help you deepen your understanding of course topics, certificate of completion, lifetime course access
>>> Click here to enroll in the course <<<
"This was a FANTASTIC video series. You are very easy to follow and this was the first resource I found that really walked through the Python language basics in terms of Machine Learning. Also, this really helped me understand the documentation on scikit-learn so that I can apply it to more complicated models." - Robin B.
Course 2: 50 scikit-learn tips
This is the perfect course for you if:
- You've taken my introductory ML course and you're ready to go deeper into scikit-learn
- You want to work more efficiently using scikit-learn's latest features
- You want to learn best practices for Machine Learning code
- You learn best through short, focused lessons
Topics covered:
- How to build, evaluate, and tune a Pipeline
- Two easy ways to visualize a decision tree
- How to benefit from missing values using a "missing indicator"
- How to plot an ROC curve in one line of code
- How to speed up a grid search
- How to add feature selection to a Pipeline
- Why you should use scikit-learn (not pandas) for preprocessing
- How to create an interactive diagram of a Pipeline
- How to save your best Pipeline for future predictions
- Why dropping a level when one-hot encoding is usually a bad idea
- How to create custom transformers for feature engineering
- Why you should use stratified sampling with train/test split
- How to build and tune an ensemble of models
- Why you should try ordinal encoding with tree-based models
- And much, much more!
Length: 3 hours
Cost: FREE
Course includes: Jupyter notebooks, certificate of completion, lifetime course access
>>> Click here to enroll in the course <<<
"Your new videos are great! I find them as excellent and concise refreshers on ML implementation topics." - Neil Dias, ML Engineer
Course 3: Building an Effective Machine Learning Workflow with scikit-learn
This is the perfect course for you if:
- You've taken my introductory ML course and you're ready to go deeper into scikit-learn
- You want to write efficient, readable, and reusable scikit-learn code that integrates well with pandas
- You want to properly handle common data issues such as missing values, text data, and categorical data
- You want to tune your entire workflow for maximum performance
- You want to take advantage of the latest scikit-learn features
Topics covered:
- Review of the basic Machine Learning workflow
- Encoding categorical features (OneHotEncoder, OrdinalEncoder)
- Encoding text data (CountVectorizer)
- Handling missing values (SimpleImputer, KNNImputer, IterativeImputer)
- Creating an efficient workflow for preprocessing and model building (Pipeline, ColumnTransformer)
- Tuning your workflow for maximum performance (GridSearchCV, RandomizedSearchCV)
- Avoiding data leakage
- Proper model evaluation
- Model persistence (pickle, joblib)
- Feature selection (SelectPercentile, SelectFromModel)
- Feature standardization (StandardScaler, MaxAbsScaler)
- Feature engineering using custom transformers (FunctionTransformer)
Length: 8 hours
Cost: $99
Course includes: Jupyter notebooks with detailed lesson notes, certificate of completion, lifetime course access, 30-day refund policy
>>> Click here to enroll in the course <<<
"I've already used the learnings from the course in a Machine Learning competition and got impressive results, while keeping the code clean and easy to understand. Also, I'm much more confident at tackling Machine Learning problems and I'm sure this will contribute a lot to my career." - João Vítor Franco, Data Scientist
Course 4: Machine Learning with Text in Python
This is the perfect course for you if:
- You've taken my introductory ML course and you're ready to apply what you learned
- You want to solve supervised Machine Learning problems using text-based data
- You want to learn Natural Language Processing techniques that you can adapt to your own datasets
- You want to work through data science problems from start to finish
- You learn best through extensive practice
Topics covered:
- What is Natural Language Processing (NLP)?
- NLP terminology
- Feature extraction from unstructured text
- Modifications to basic tokenization
- Document summarization
- Sentiment analysis
- Advanced text processing with regular expressions
- Data exploration & visualization with pandas
- Feature engineering with pandas
- Classification with Naive Bayes & Logistic Regression
- Proper model evaluation
- Classification metrics
- Multi-class classification
- Pipeline tuning with grid search & randomized search
- Model ensembling & stacking
Length: 14 hours
Cost: $299
Course includes: Jupyter notebooks with detailed lesson notes, substantial practice projects with provided solutions, 100+ resources to help you deepen your understanding of course topics, certificate of completion, lifetime course access, 30-day refund policy
>>> Click here to enroll in the course <<<
"You won't find a better course to learn about NLP and Machine Learning in Python anywhere else! Kevin has a way of making difficult topics very accessible and understandable. I was able to quickly apply much of the theory and code regarding NLP and Machine Learning from this course to my own job." - Cliff Baker, Statistician
Frequently Asked Questions
What are the main differences between these courses?
Introduction to Machine Learning with scikit-learn: The goal of this 4-hour course is to introduce you to the basic Machine Learning process and how to implement it using scikit-learn. It teaches you the most important concepts in depth so that you will be prepared to execute simple Machine Learning projects with clean datasets.
50 scikit-learn tips: The goal of this 3-hour course is to help you write better scikit-learn code. Through 50 short lessons, it teaches you how to work more efficiently using scikit-learn's latest features and apply Machine Learning best practices.
Building an Effective Machine Learning Workflow with scikit-learn: The goal of this 8-hour course is to help you work extremely efficiently in scikit-learn with complex, real-world datasets. It teaches you how to move your entire Machine Learning workflow (including data preprocessing, feature engineering, and feature selection) into a scikit-learn pipeline in order to maximize model performance and eliminate data leakage. The course is completely up-to-date (released April 2020) and can be applied to any supervised learning problem using any scikit-learn model.
Machine Learning with Text in Python: The goal of this 14-hour course is to help you build excellent Machine Learning models when the input data is text. It teaches you a variety of Natural Language Processing and text processing techniques that support this goal. It walks you through data science problems from start to finish, making use of scikit-learn, pandas, seaborn, TextBlob, and other Python libraries. It includes substantial practice projects with provided solutions, as well as 100+ resources for digging deeper into the course topics.
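To give a rough idea of what "moving your entire workflow into a scikit-learn pipeline" can look like, here is a minimal sketch. It is not taken from the course materials: the toy dataset, the column names (age, embarked, survived), and the step names are all hypothetical, and the choice of imputer, encoder, and model is only one possible combination. Because the preprocessing lives inside the Pipeline, cross-validation refits it on each training fold, which is how this approach helps avoid data leakage.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: one numeric column with a missing value,
# one categorical column, and a binary target.
df = pd.DataFrame({
    "age": [22, 38, None, 35, 54, 2, 27, 14, 40, 31],
    "embarked": ["S", "C", "S", "S", "Q", "S", "C", "C", "Q", "S"],
    "survived": [0, 1, 1, 1, 0, 0, 1, 1, 0, 0],
})
X = df[["age", "embarked"]]
y = df["survived"]

# Apply a different preprocessing step to each column.
preprocessor = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), ["age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["embarked"]),
])

# Chain preprocessing and the model so they are always fit together.
pipe = Pipeline([
    ("preprocess", preprocessor),
    ("model", LogisticRegression(max_iter=1000)),
])

# Each fold fits the imputer and encoder on its own training portion only.
scores = cross_val_score(pipe, X, y, cv=2, scoring="accuracy")
print(scores.mean())

# Fit on all of the data and predict for new, unseen rows.
pipe.fit(X, y)
X_new = pd.DataFrame({"age": [29, None], "embarked": ["C", "S"]})
predictions = pipe.predict(X_new)
print(predictions)
```

The same Pipeline object could be passed to GridSearchCV or RandomizedSearchCV to tune the preprocessing and the model together, since it behaves like a single estimator.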
In what order should I take these courses?
You should start with my beginner course, Introduction to Machine Learning with scikit-learn. After that, you will be ready to take my other three courses, and you can take them in any order.
How much overlap is there between these courses?
There is a small overlap of material (5-10%) between these courses. Many students have taken all four of my courses and have been quite satisfied!
Do you offer any discounts?
Yes! I offer a Purchasing Power Parity discount to make my courses more affordable to people living in the developing world. I also offer discounts to full-time students and to anyone experiencing a significant financial hardship due to the pandemic. Please email me and I'd be happy to send you the appropriate discount code.
I have another question...
Please email me and I'd be happy to help!