Installing Python Packages for DSE200x with pip

Python for Data Science is an introductory course that gives an overview of the Python tools commonly used for data science.

The tools covered include:

  • Jupyter Notebooks
  • The NumPy library
  • The SciPy library
  • The pandas library
  • The Matplotlib library

The course provides an excellent overview of Python and suggests using Anaconda, a Python distribution geared toward data science. Although this is a great way to get started if you have never used Python before, it installs another copy of Python alongside any you already have, and multiple Python versions can be quite a pain to manage long term. This is especially true if you also use Python for other purposes, such as web development with Flask or Django.

If you try to work through some of the Jupyter notebooks presented in the course without installing Anaconda, you will often see error messages like this:

ModuleNotFoundError Traceback (most recent call last)
1 get_ipython().run_line_magic('matplotlib', 'inline')
2 import numpy as np
----> 3 from scipy import misc
4 import matplotlib.pyplot as plt

ModuleNotFoundError: No module named 'scipy'

The solution is to install the missing package. In the example above, that means running pip3 install scipy.
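If more than one Python is installed, pip3 may belong to a different interpreter than the one running your notebook, in which case the error persists after installing. One way around this is to run pip as a module of the interpreter you are actually using. The following is a minimal sketch, not something the course prescribes; the helper name pip_install_command is hypothetical, and the script only prints the command so you can review it before running it yourself.

```python
import shlex
import sys


def pip_install_command(packages):
    """Build a pip command bound to the interpreter running this code.

    Hypothetical helper for illustration only.
    """
    # sys.executable is the path of the current Python interpreter, so
    # "-m pip" invokes the pip that installs into that same interpreter.
    return [sys.executable, "-m", "pip", "install", *packages]


# Print the command rather than running it, so nothing is installed
# without you seeing exactly what would be executed.
print(shlex.join(pip_install_command(["scipy"])))
```

Run from a notebook cell, this prints the command for the notebook's own interpreter, which you can then paste into a terminal.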

You can use the one-liner below to install all of the packages required for the course in one go. Note that this assumes you already have python3 and pip3 installed on your computer.

pip3 install jupyter pandas numpy scipy matplotlib imageio folium
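
To confirm the install worked before opening a notebook, you can check whether each package is importable. This is a rough sketch that assumes every package in the command above is imported under the same name it is installed with; the helper name missing_modules is hypothetical.

```python
import importlib.util

# Packages from the pip3 command above. Assumption: each one is importable
# under the same name used to install it.
COURSE_PACKAGES = [
    "jupyter", "pandas", "numpy", "scipy", "matplotlib", "imageio", "folium",
]


def missing_modules(names):
    """Return the names that cannot be found on the import path."""
    # find_spec returns None when a top-level module is not installed,
    # without actually importing the module.
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(COURSE_PACKAGES)
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All course packages are importable.")
```

Any name reported as missing can be installed on its own, the same way scipy was in the earlier example.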
