What are conda, Anaconda, and Miniconda? 🐍

If you've ever taken one of my data science courses, you've probably noticed that I frequently recommend the Anaconda distribution of Python.

You might be left wondering:

  • What is the Anaconda distribution, and why do people recommend it?
  • How is it related to conda?
  • How is it related to Miniconda?
  • As a data scientist, which of these do I need to be familiar with?

I'll answer those questions below! 👇

What is Anaconda?

Anaconda is a Python distribution aimed at data scientists that includes 250+ packages (with easy access to 7,500+ additional packages). Its value proposition is that you can download it (for free) and "everything just works." It's available for macOS, Windows, and Linux.

A new Anaconda distribution is released a few times a year. Within each distribution, the versions of the included packages have all been tested to work together.

If you visit the installation page for many data science packages (such as pandas), they recommend Anaconda because it makes installation easy!

What is conda?

conda is an open source package and environment manager that comes with Anaconda.

As a package manager, conda lets you install, update, and remove packages and their "dependencies" (the packages they depend upon):

  • If Anaconda doesn't include a package that you need, you use conda to download and install it.
  • If Anaconda doesn't include the version of a package that you need, you use conda to upgrade or downgrade it.
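
To make those two cases concrete, here's a minimal sketch that builds the conda commands you'd type in a terminal. It only prints them and doesn't run anything. The helper name conda_package_command is my own invention for illustration (it isn't part of conda), and the package names and version number are placeholders.

```python
import shlex


def conda_package_command(action, package, version=None):
    """Build (but do not run) a conda command for a single package.

    Hypothetical helper for illustration; it is not part of conda itself.
    """
    if action not in {"install", "update", "remove"}:
        raise ValueError(f"unsupported action: {action}")
    if version is not None and action != "install":
        raise ValueError("a version can only be requested with 'install'")
    # conda accepts "name=version" to request a specific version.
    spec = f"{package}={version}" if version else package
    return ["conda", action, spec]


# Placeholder package names and version, chosen only for the example.
examples = [
    conda_package_command("install", "seaborn"),               # add a missing package
    conda_package_command("install", "pandas", version="2.2"),  # request a version
    conda_package_command("update", "pandas"),                 # move to the latest
    conda_package_command("remove", "seaborn"),                # uninstall
]
for cmd in examples:
    print(shlex.join(cmd))
```

Whichever command you run, conda also works out which dependencies need to change and asks you to confirm before it modifies anything.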

As an environment manager, conda lets you manage virtual environments:

  • If you're not familiar with virtual environments, they allow you to maintain isolated environments with different packages and versions of those packages.
  • conda is an alternative to virtualenv, pipenv, and other related tools.
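
If you're curious which environment your code is actually running in, here's a small sketch you can run from any Python session. It relies on two conventions as I understand them: each conda environment has its own folder containing a "conda-meta" directory, and "conda activate" sets the CONDA_DEFAULT_ENV variable. The function name describe_active_environment is made up for this example.

```python
import os
import sys


def describe_active_environment():
    """Report where the running interpreter lives and whether it looks like conda.

    Hypothetical helper for illustration; the checks are heuristics.
    """
    prefix = sys.prefix
    return {
        # Each environment has its own prefix folder, which is what isolates it.
        "prefix": prefix,
        # conda keeps per-environment package records in a "conda-meta" folder.
        "looks_like_conda": os.path.isdir(os.path.join(prefix, "conda-meta")),
        # Set by "conda activate"; None when no conda environment is active.
        "active_name": os.environ.get("CONDA_DEFAULT_ENV"),
        "python_version": ".".join(str(n) for n in sys.version_info[:3]),
    }


for key, value in describe_active_environment().items():
    print(f"{key}: {value}")
```

Run it in two different environments and you should see a different prefix (and possibly a different Python version) in each one.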

conda has a few huge advantages over other tools:

  • It's a single tool to learn, rather than using multiple tools to manage packages, environments, and Python versions.
  • Package installation is predictably easy because you're installing pre-compiled binaries.
  • You rarely need to build from source code, which can be especially difficult for some data science packages. (pip falls back to building from source when no pre-built wheel is available for your platform.)
  • You can use conda with languages other than Python.
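
The "single tool" point is easiest to see in one command: creating an environment, choosing its Python version, and installing packages all at once. Here's a rough sketch that builds that command without running it. The helper conda_create_command is hypothetical, and the environment name "analysis", the Python version, and the package list are placeholders.

```python
import shlex


def conda_create_command(env_name, python_version, packages=()):
    """Build (but do not run) a 'conda create' command.

    Hypothetical helper for illustration; it is not part of conda itself.
    """
    # "python=X.Y" asks conda to install that Python version into the new environment.
    cmd = ["conda", "create", "--name", env_name, f"python={python_version}"]
    cmd.extend(packages)
    return cmd


# Placeholder values, chosen only for the example.
create = conda_create_command("analysis", "3.11", ["pandas", "scikit-learn"])
print(shlex.join(create))
# After the environment exists, you switch into it from your shell with:
print("conda activate analysis")
```

With other toolchains, those steps are often split across a Python version manager, an environment tool, and a package installer.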

What is Miniconda?

Miniconda is a Python distribution that includes only Python, conda, their dependencies, and a few other useful packages.

Miniconda is a great choice if you prefer to only install the packages you need, and you're sufficiently familiar with conda. (Here's how to choose between Anaconda and Miniconda.)
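
Since Miniconda starts out nearly empty, it can help to check which modules your project needs but doesn't have yet. Here's a minimal sketch using Python's standard importlib.util.find_spec. The function name missing_modules and the list of wanted modules are my own placeholders.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that can't be imported here.

    Hypothetical helper for illustration; pass top-level names only.
    """
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Placeholder list; replace it with the modules your project imports.
wanted = ["numpy", "pandas", "matplotlib"]
missing = missing_modules(wanted)
if missing:
    # Import names and conda package names can differ (for example, the
    # "sklearn" module is installed by the "scikit-learn" package), so
    # treat this output as a hint rather than a command to copy verbatim.
    print("Not installed:", ", ".join(missing))
else:
    print("All requested modules are available.")
```

On a fresh Miniconda install, I'd expect all three of those to show up as missing until you install them with conda.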

Summary:

  • Anaconda and Miniconda are both Python distributions.
  • Anaconda includes hundreds of packages, whereas Miniconda includes just a few.
  • conda is an open source tool that comes with both Anaconda and Miniconda, and it functions as both a package manager and an environment manager.

Personally, I make extensive use of conda for creating environments and installing packages. And since I'm comfortable with conda, I much prefer Miniconda over Anaconda.

Do you have questions about conda, Anaconda, or Miniconda? Let me know in the comments section below! 👇