A few weeks ago I wrote a blog post titled Should You Learn to Program with Python. If you read that and decided the answer is yes, then this post is for you.
Python is a versatile and useful programming language. It’s general purpose and allows several different styles of programming (object oriented, procedural, functional, etc.); it’s a great scripting language; it runs interactively, so it’s fast to develop and experiment with; it’s probably the best “glue” language; it has an enormous collection of add-on modules; it’s relatively easy to learn; it has a large user base; and it’s easy to install and get started with. So, let’s install and get started!
Why use Anaconda Python?
Python is free, open-source software, and so are the vast majority of modules and tools for it. If you use Linux or MacOS then you will have a default Python installed and you can use it. For Windows you have to install something. There are several options for setting up a nice Python development environment, but the setup that is becoming “standard” is Continuum Analytics’ Anaconda Python. It’s well done and has a great tool for managing module packages and build environments called conda. It has a focus on Data Analytics, Machine Learning and numerical computing. There are lots of good reasons to use it.
Anaconda is mostly a free open-source product of Continuum Analytics. Continuum offers paid subscriptions for support, extra capability and “business features”. The founders of Continuum are some of the best Python programmers and have written code for several important and widely used Python packages. They deserve support and funding for their work. Starting a business is one way to achieve that. Be aware that when you read the documentation, free and paid features will sometimes be mixed together. This is my only personal complaint and I wanted to get it out of the way up-front.
Here are some of the reasons to use Anaconda:
The people behind it are great Python programmers
It is a very nice collection of the most important Python packages
Package management using conda
It has some optimized numerical libraries linked with the latest Intel MKL libraries
It has a Data Analytics and numerical computing focus
They support NumFOCUS
It is a strong “ecosystem” for developers
It is available on Linux, MacOS and Windows
It is an Open Data Science platform
It is also a very good “R” platform
By default it installs to your home directory and doesn’t interfere with any other Python install you may have.
It is easy to install!
How to Install Anaconda Python
I’ll go through the Linux and Windows installs. (It is also simple to install on MacOS but I won’t discuss it here.) After the simple install instructions I’ll give you a couple of pointers and links to get you started using your new Python.
I’m using Ubuntu 16.04 but the install procedure should be the same for any Linux version or distribution. The install is via a shell archive script. It’s trivial to install.
Point your web browser at the Continuum downloads page, https://www.continuum.io/downloads. Download the latest version of the shell archive, which was Anaconda3-4.3.1-Linux-x86_64.sh when I wrote this. If you think you will need to use Python 2.7, don’t worry about it now. The version you are installing at this point is what will be your default base. You can use any Python version you want when you set up environments for your projects. The conda tool will get the version you want for a specific project automatically.
It’s a good idea to check the sha256 hash for the file you just downloaded. (You don’t have to do this but it is good practice.) Go to https://docs.continuum.io/anaconda/hashes/, follow the link to the version you downloaded and look at the page. You will see information about when the file was created, its size and the hashes. Then generate a hash on your local machine (on Linux, sha256sum Anaconda3-4.3.1-Linux-x86_64.sh will do it) and check that it matches the published value.
That checked out OK for me.
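The hash comparison described above can also be sketched in Python using only the standard library. This is a minimal illustration, not part of the Anaconda tooling: the function names sha256_of_file and matches_published_hash are made up for this example, and the published hash is whatever you copy from the hashes page.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large installer is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare the local digest with a published one, ignoring case and whitespace."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

You would call matches_published_hash with the path of the downloaded installer and the sha256 string from the hashes page; a result of False means the download should not be trusted.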
Now you just need to run the install script, for example with bash Anaconda3-4.3.1-Linux-x86_64.sh. It will unpack the archive and set things up for you.
You will be asked to read a license agreement, and then it will offer you a default install location with the option to change it. After it finishes installing it will offer to add the anaconda3/bin directory to your PATH variable. Exit the terminal session you were using and start another, and you will have the new PATH variable set. You are done with the install!
The Windows install is similar to the Linux install, but instead of a shell archive it’s a self-extracting installer exe file.
Point your web browser at the Continuum downloads page, https://www.continuum.io/downloads. Download the latest version of the Windows installer, which was Anaconda3-4.3.1-Windows-x86_64.exe when I wrote this.
After downloading, just double-click the installer exe and follow the prompts. By default it will install in the directory Anaconda3 in your home directory and will offer to add the anaconda bin directory to your PATH variable, the same as it did with the Linux install.
Getting started with Anaconda Python
Your initial interaction with Anaconda Python will be through the terminal. Now that the anaconda directory is on your PATH, Python 3.6 should be your default. Try it by typing python at the shell prompt.
On Windows you will have new Start menu items from the Anaconda Python install.
If you open the app “Anaconda Prompt” you can do the same thing I just did on Linux.
The >>> symbol is the Python interpreter prompt that you can use interactively. To exit from the Python session you can type Ctrl-D in Linux, Ctrl-Z followed by Enter in Windows, or quit() in either one.
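Once you are at the >>> prompt, a quick way to confirm which interpreter you actually started is to ask Python itself. The sketch below uses only the standard sys module; the helper name interpreter_summary is made up for this example. With a default Anaconda install, the executable path would be expected to point inside your anaconda3 (or Anaconda3) directory.

```python
import sys


def interpreter_summary():
    """Return the running interpreter's version string and executable path."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    return version, sys.executable


version, executable = interpreter_summary()
print("Python", version)
print("Running from", executable)
```

If the path points somewhere else, such as a system Python, the PATH change from the installer has probably not taken effect in the current terminal session.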
On Linux you can type ipython, or in Windows click the ipython icon, and you will get an enhanced interactive Python shell that has many useful features.
You can work with Python from the command-line interactively and use a program editor to work on a Python script or module and execute it from the Python prompt. This is the traditional way of working and is simple and efficient.
You can also use Python from the command-line as a “super” calculator. That’s something I do often.
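As a small illustration of the “super” calculator idea, here are a few expressions of the kind you might type at the prompt. The numbers are arbitrary examples chosen for this sketch.

```python
import math

# Compound growth: 1000 units at 5% per year for 10 years.
balance = 1000 * 1.05 ** 10

# Python integers have arbitrary precision, so large powers stay exact.
big = 2 ** 100

# The standard math module supplies the usual functions and constants.
hypotenuse = math.hypot(3, 4)
circle_area = math.pi * 2.5 ** 2

print(round(balance, 2), big, hypotenuse, round(circle_area, 3))
```

At the interactive prompt you would not need the print call; typing an expression by itself shows its value.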
Working from the command-line is fine, but there is also a very powerful and popular browser-based notebook interface, Jupyter. This may become your main tool for interacting with Python. Jupyter evolved from IPython and is now a surprisingly useful interface for many languages (not just Python).
With Jupyter you can have documentation in Markdown, nice mathematical equations with LaTeX, graphs, plots, images, executable code and output all in the same document. I am currently reading a book that was written using Jupyter! (“Introduction to Machine Learning with Python”, by Andreas C. Müller and Sarah Guido, O’Reilly Media, Inc., 2016)
To start Jupyter just type jupyter notebook in Linux or click on the Jupyter Notebook icon in Windows. You will want to spend some time reading the documentation and experimenting with Jupyter.
There is an interesting (and large!) collection of Jupyter notebooks here: https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks. These links will lead to notebooks that “should” open in the notebook viewer nbviewer so that you can read them online. If you find something that you want to download and open in your own copy of Jupyter to edit and work with, look for the download icon in nbviewer to get it. If the notebook opens in a GitHub page then you will need to “git clone” or download the zip file from GitHub to get the files. Warning: if you try to save the file with your browser directly from the GitHub listing, you will just get the HTML-rendered version of the notebook. It will not open in Jupyter, even though it has the same .ipynb file extension!
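Since a real notebook file is JSON and a page saved from the browser is HTML, the two can be told apart before you try to open them. The following is a minimal sketch, assuming that a genuine notebook has a top-level "nbformat" field; the function name looks_like_notebook is made up for this example.

```python
import json


def looks_like_notebook(path):
    """Return True if `path` parses as JSON and carries a top-level "nbformat" key."""
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except ValueError:
        # Covers both invalid JSON (such as a saved HTML page) and undecodable bytes.
        return False
    return isinstance(data, dict) and "nbformat" in data
```

A file that returns False here is most likely the rendered web page, and the fix is to download the raw file or clone the repository instead.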
I recommend you start slowly, read the documentation and tutorials, and experiment. Keep in mind that your default Python is the latest 3.6 version. You may run across old notebook files written for Python 2.7, and you could have some trouble with them unless you open them in a 2.7 notebook session. You can do that! But you will have to install some other Python versions in Anaconda, which we will look at doing in the next section.
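One way to guess which Python a downloaded notebook was written for is to look at its metadata. The sketch below assumes the notebook records a "language_info" entry with a "version" field in its metadata, which is optional and may be missing from older files; the function name notebook_python_version is made up for this example.

```python
import json


def notebook_python_version(path):
    """Return the Python version string recorded in a notebook, or None if absent."""
    with open(path, encoding="utf-8") as handle:
        metadata = json.load(handle).get("metadata", {})
    # "language_info" is optional metadata, so fall back to None when it is missing.
    return metadata.get("language_info", {}).get("version")
```

A result starting with "2." suggests the notebook will be happier in a Python 2.7 environment; None simply means the file does not say.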
Conda and Anaconda Navigator
One of the useful things about Anaconda Python is its tools for Python package management and project environments. The core tool for this is the command-line utility conda. There is also a new GUI tool called “Anaconda Navigator”. I’m only going to talk about conda since I haven’t spent much time with “Anaconda Navigator” yet. If you want to start with “Navigator” it is still a good idea to learn a little conda first so you understand what the GUI is actually trying to do. After you are familiar with conda you are encouraged to look at “Anaconda Navigator”, since it collects several resources, including useful links to documentation.
Why is Conda or “Anaconda Navigator” Important?
Package management: There are tens of thousands of Python packages (including hundreds of useful ones :-). Linux distributions will have a default Python (2 and 3) and many Python packages in their repositories. If you use these packages you are limited in variety and versions (and quality of build). They also get installed globally on your system. There is also the powerful Python pip command that you can use to install packages from PyPI. These options can be difficult to manage for specific projects, and version control and especially updates are problematic. With conda you can easily update your current environment (more on that in a bit…) and you can set up a work environment with control over the versions used. It is a very powerful tool for installing packages and can even install a completely different version of Python than the one you set up with your default install.
Environments: When you are working on a project you will need a variety of packages and possibly specific versions of those packages. (Think of what you might want for a web development project vs. a machine learning project.) Python developers have been using virtual environments or complete virtual machines for this. There are existing tools for this, like virtualenv or, for virtual machine management, Vagrant. These tools are useful, but conda is very versatile and powerful for these tasks.
How to learn Conda
I’ll give a couple of examples of using conda, but first here are some suggestions to get started.
Start with the
Get a copy of the Conda Cheat Sheet
Then read some of the Using conda documentation
That will get you started and it won’t take long for you to realize how powerful and useful conda is.
A few examples of using conda
Add a Python 2.7 environment
I said earlier that we could add a Python 2.7 version. Here’s a way to add an environment with the packages you would have installed if you had used the Python 2.7 version of Anaconda as your default install.
Look at your current environments with conda info --envs.
Now let’s add Python 2.7 by running conda create --name python2.7 python=2.7 anaconda.
That creates a directory python2.7 in the default anaconda3/envs directory, sets the python version to 2.7 (latest) and then installs the meta-package “anaconda”. That’s all of the default Anaconda packages for Python 2.7.
Note: If you use --prefix as suggested in the “conda cheatsheet” it creates the directory and environment but doesn’t correctly add to the env list. I recommend you just use “--name”.
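To check from a script whether a new environment really shows up in the env list, you can ask conda for its list in JSON form. This is a sketch under a few assumptions: that conda is on your PATH, that conda env list --json prints an object with an "envs" list of directories, and that an empty list is an acceptable answer when conda cannot be found. The function name conda_env_paths is made up for this example.

```python
import json
import shutil
import subprocess


def conda_env_paths():
    """Return the environment directories conda reports, or [] if conda is unavailable."""
    conda = shutil.which("conda")
    if conda is None:
        return []
    result = subprocess.run(
        [conda, "env", "list", "--json"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return []
    try:
        return json.loads(result.stdout).get("envs", [])
    except ValueError:
        # Unexpected output that is not JSON; treat it as "nothing found".
        return []


for path in conda_env_paths():
    print(path)
```

An environment created with a name, as recommended above, would be expected to appear as a path ending in envs/python2.7.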
To use this environment, run source activate python2.7.
To leave that environment, run source deactivate.
Note: On Windows leave off the command source, i.e. just use activate python2.7 and deactivate.
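After activating an environment it can be reassuring to confirm, from inside Python, which one you are in. A minimal sketch: conda’s activate script sets the CONDA_DEFAULT_ENV environment variable, and sys.prefix shows the directory of the running interpreter. The function name active_environment is made up for this example.

```python
import os
import sys


def active_environment():
    """Return (environment name, interpreter prefix) for the running Python.

    The name is None when no conda environment has been activated in this shell.
    """
    name = os.environ.get("CONDA_DEFAULT_ENV")
    return name, sys.prefix


name, prefix = active_environment()
print("Environment:", name)
print("Prefix:", prefix)
```

Inside the environment created earlier, the prefix would be expected to end in envs/python2.7 under your Anaconda directory.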
Install Intel Python with conda
Yes, it was that easy to install Intel Python!
Set up a Python Machine Learning environment
Lastly, let’s set up an environment for some machine learning experimentation.
Ready for some data analysis and machine learning fun!
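Once such an environment is active, a quick check that the packages you expect are importable can save some confusion later. The sketch below uses importlib from the standard library. The package list is a hypothetical example chosen for illustration, not the list used for the environment above, and the function name available_packages is also made up.

```python
import importlib.util

# Hypothetical list for illustration; substitute the packages you actually installed.
CANDIDATES = ["numpy", "scipy", "pandas", "matplotlib", "sklearn"]


def available_packages(names):
    """Map each top-level module name to True if it can be found in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


for name, found in available_packages(CANDIDATES).items():
    print(name, "ok" if found else "missing")
```

Anything reported as missing can usually be added to the active environment with conda install followed by the package name.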
Happy computing! –dbk