Starting a Python Project with Anaconda

On a few systems I have been using Anaconda to make Python coding painless. For example, on Windows or non-Debian Linux I have struggled to compile packages from source. Anaconda provides a useful wrapper for the main functionality that just works on these operating systems. (On my Ubuntu machine or the Raspberry Pi I just use virtualenv and pip in the usual way.)

Anaconda also has the advantage of being a quick shortcut to install Python and a bucketful of useful libraries for big data and artificial intelligence experimentation. To start, head over to the download page for Anaconda here. The installer is wrapped in a bash script: just download, verify and run. On my ten-year-old laptop running Puppy Linux (which was in the loft for a year or so covered in woodlouse excrement) this simply worked. No compiling from source. No version errors. No messing with pip. Previously, libraries like numpy or nltk had been a headache to install.

I find that Jupyter (formerly IPython) notebooks are a great way to code iteratively. You can test out ideas block by block, shift stuff around, output and document, all in the same tool. You can also easily export to HTML with one click (hence this post). To start a notebook, having installed Anaconda, run the following:

jupyter notebook

This will start the notebook server on your local machine and open your browser. By default the notebooks are served at localhost:8888. To access them across a local network, use the --ip flag with your IP address (e.g. --ip=192.168.1.2) and then point your browser at [your-ip]:8888 (use --port to change the port).
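
If you start the server from a script rather than by hand, the command line can be assembled in Python. This is only a minimal sketch: the helper names notebook_command and notebook_url are my own for illustration, and it assumes the long-form --ip and --port options of the jupyter notebook command.

```python
import ipaddress


def notebook_command(ip, port=8888):
    """Build the argument list for starting the notebook server.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    ipaddress.ip_address(ip)  # raises ValueError on a malformed address
    return ["jupyter", "notebook", f"--ip={ip}", f"--port={port}"]


def notebook_url(ip, port=8888):
    """Return the address to open in a browser on another machine."""
    return f"http://{ip}:{port}"


if __name__ == "__main__":
    # 192.168.1.2 is the example address from the text, not a real default.
    print(" ".join(notebook_command("192.168.1.2", port=8889)))
    print(notebook_url("192.168.1.2", port=8889))
```

The list returned by notebook_command can be passed straight to subprocess.run if you want the script to launch the server itself.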

My usual way of working is to play around with my code in a Jupyter notebook before officially starting a project. I find notebooks better for the first iterative rounds of testing and developing algorithms than coding and testing on the command line.

Once I have some outline functionality in a notebook it is time to create a new project. My workflow for this is as follows:

  1. Create a new empty repository on GitHub, with a Python .gitignore file, a basic ReadMe file and an MIT License.
  2. Clone the new empty repository into my local projects directory. I have set up SSH keys so this just involves:
     git clone git@github.com:[username]/[repositoryname].git 
  3. Change directory into the newly cloned project directory:
     cd [repositoryname] 
  4. Create a new Conda environment. Conda is the command line package manager element of Anaconda. This page is great for working out the Conda commands equivalent to virtualenv and pip commands.
     conda create --name [repositoryname] python
  5. Activate the new environment (on conda 4.4 and later, conda activate [repositoryname] is the preferred form):
     source activate [repositoryname] 
  6. Create a requirements.txt file (re-run this after installing libraries in step 7 so the file reflects them):
     conda list --export > requirements.txt 
  7. Install the required libraries (you can take these from your Jupyter notebook imports), e.g.:
     conda install nltk 
  8. Create a new .py file for your main program, move across your functions from your notebook and perform a first git commit and sync with GitHub.
     git add .
     git commit -m "First Commit"
     git push origin master
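
Step 7 suggests taking the library names from your notebook imports. Rather than scrolling through cells, the names can be pulled out of the .ipynb file, which is plain JSON. The following is a rough sketch under a few assumptions: the notebook uses the nbformat 4 layout (a top-level "cells" list), and the function names notebook_imports and imports_from_file are made up for this example.

```python
import ast
import json


def notebook_imports(notebook):
    """Return the sorted top-level module names imported in a notebook.

    `notebook` is the parsed JSON of an .ipynb file (nbformat 4 layout assumed).
    """
    modules = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # source is often stored as a list of lines
            source = "".join(source)
        # Drop IPython magics and shell escapes, which are not valid Python.
        lines = [line for line in source.splitlines()
                 if not line.lstrip().startswith(("%", "!"))]
        try:
            tree = ast.parse("\n".join(lines))
        except SyntaxError:
            continue  # skip cells that still do not parse
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                modules.add(node.module.split(".")[0])
    return sorted(modules)


def imports_from_file(path):
    """Load an .ipynb file from disk and list its imported modules."""
    with open(path, encoding="utf-8") as handle:
        return notebook_imports(json.load(handle))
```

Treat the result as a starting list only. Standard library modules show up too, and an import name does not always match the package name (sklearn is installed as scikit-learn, for instance).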

Hey presto. You are ready to start developing your project.
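
Since the workflow is the same for every new project, the clone and Conda steps could be collected into a small helper. This is a sketch rather than a finished tool: setup_commands is a hypothetical name, the user and repository names are placeholders, and it passes --name to conda instead of activating the environment, because an activation done in one subprocess would not carry over to the next. It only builds the commands; running them is left to you.

```python
import shlex


def setup_commands(username, repo, libraries=()):
    """Build the workflow's commands as argument lists, without running them."""
    commands = [
        ["git", "clone", f"git@github.com:{username}/{repo}.git"],
        ["conda", "create", "--yes", "--name", repo, "python"],
    ]
    if libraries:
        # --name targets the environment directly, so no activation is needed.
        commands.append(["conda", "install", "--yes", "--name", repo, *libraries])
    # The output of this last command is what would go into requirements.txt.
    commands.append(["conda", "list", "--name", repo, "--export"])
    return commands


if __name__ == "__main__":
    # "example-user" and "example-repo" are placeholders, not real names.
    for argv in setup_commands("example-user", "example-repo", ["nltk"]):
        print(shlex.join(argv))
```

The git add, commit and push steps are deliberately left out, as they need to run inside the cloned directory once there is something to commit.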
