Tutorial 17: Writing custom code

Python notebooks are great for a lot of tasks, as we have seen. They make it easy to run pre-made code without knowing a lot about Python and are excellent for mixing long text snippets in with code. Notebooks also make it easy to embed graphics (even interactive ones) and to share the output with others. I love using them for tutorials, notes, and the graphics- and model-intensive parts of a data science analysis.

However, in many situations it is better to write Python code within a .py script file. For one thing, functions in a .py file can be called across multiple notebooks. It's also easier to work with raw files once your code becomes more complex. We've already seen examples of this with the wiki.py and iplot.py files. Now, you'll be able to do the same thing.

Autoreload

By default, a running Python process will only load a module (a .py file) once. It assumes that modules do not change. This can be annoying if you are running a notebook to test code that you've written in a module file. To circumvent that, run the following two lines of code (you need to run them before you import the module you want to have automatically reloaded):

In [1]:
%load_ext autoreload
%autoreload 2

The notebook will now automatically reload the module when the file is changed.
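
To make that behavior concrete, here is a minimal sketch of what autoreload saves you from doing by hand. It uses the standard library's importlib.reload rather than the IPython extension itself, so it also runs outside a notebook. The module name reload_demo, the helper write_module, and the temporary directory are all made up for illustration.

```python
import importlib
import os
import sys
import tempfile

# Skip writing .pyc files so the demo can never pick up stale bytecode.
# (This is a process-wide setting; it is only here to keep the demo robust.)
sys.dont_write_bytecode = True


def write_module(directory, message):
    """(Re)write the hypothetical demo module so hello() returns message."""
    path = os.path.join(directory, 'reload_demo.py')
    with open(path, 'w') as handle:
        handle.write('def hello():\n    return {!r}\n'.format(message))


# Put a throwaway directory on the import path and create the module in it.
workdir = tempfile.mkdtemp()
sys.path.insert(0, workdir)
write_module(workdir, 'Hello!')
importlib.invalidate_caches()

reload_demo = importlib.import_module('reload_demo')
first = reload_demo.hello()

# Edit the file on disk. The already-imported module does not notice.
write_module(workdir, 'Hello you!')
stale = reload_demo.hello()

# An explicit reload re-executes the file and picks up the change.
importlib.reload(reload_demo)
fresh = reload_demo.hello()

print(first, stale, fresh, sep=' | ')
```

With `%autoreload 2` active, the notebook takes care of that reload step for you before each cell runs.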

Create a module

Now, go to the main Jupyter notebook page and create a new text file. Name the file wikitext.py. Create a function named print_hello that accepts no arguments and simply prints out the string 'Hello!'. Save the file and test that you can load the wikitext module:

In [2]:
import wikitext

Now, run the function print_hello and verify that the output is correct:

In [3]:
wikitext.print_hello()
Hello!

Test that the autoreload functionality works (it will be very annoying later if you think it's working and it is not). Change the message in the file to 'Hello you!', save the file, and test the function again:

In [4]:
wikitext.print_hello()
Hello!

You can treat the text in a .py file like one giant code block. Import other modules, define variables, and define functions, all directly within the file.
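
As a rough sketch of what such a file might look like (one possible layout, not the required solution), here is a small module with an import, a module-level variable, and two functions. Only print_hello comes from the exercise above; the GREETING constant and the count_words helper are invented here purely to show the other pieces.

```python
"""wikitext.py: a hypothetical sketch of a small module file."""

import re

# A module-level variable, created once when the module is first imported.
GREETING = 'Hello!'


def print_hello():
    """Print a fixed greeting."""
    print(GREETING)


def count_words(text):
    """Return the number of whitespace-separated words in text."""
    return len(re.findall(r'\S+', text))
```

From a notebook you would then call these as wikitext.print_hello() and wikitext.count_words('some text'), exactly like functions from any other imported module.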

pycodestyle

Another benefit of using .py files is that we can now use automated tools to check our code. I typically use two different tools. The first is the relatively basic pycodestyle. To run it, import the module and run all of the checks on a filename. Here we will check my code for wiki.py.

In [5]:
import pycodestyle
In [6]:
pycodestyle.Checker(filename='wiki.py').check_all()
Out[6]:
0

You should see 3-4 warnings where my code does not conform to the Python standards. Try to adjust these in the script, and re-run the checker until you get rid of all the warnings. Hint: the first error on line 69 can be dealt with by replacing '+' with '\+'.

In [7]:
pycodestyle.Checker(filename='wiki.py').check_all()
Out[7]:
0

Linting with pylint

From Wikipedia:

A linter or lint refers to tools that analyze source code to flag programming errors, bugs, stylistic errors, and suspicious constructs. The term originates from a Unix utility that examined C language source code.

The standard linter in Python is called pylint. It tends to catch a lot more issues than pycodestyle. Let's run it on my wiki.py code:

In [8]:
from pylint.epylint import lint
In [9]:
lint("wiki.py")
 --------------------------------------------------------------------
 Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)
 
 
Out[9]:
0

As above, try to adjust my code to conform to all of these concerns. Re-check that everything works and you get a 10/10 score. Finally, do the same thing with iplot.py (it may take a minute or two to run on this file):

In [10]:
pycodestyle.Checker(filename='iplot.py').check_all()
Out[10]:
0
In [11]:
lint("iplot.py")
 --------------------------------------------------------------------
 Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)
 
 
Out[11]:
0

Try to adjust my code here as well. Don't worry if you have a few remaining warnings, but try to get close. We'll talk next time about any remaining issues.
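
One practical note on the lint(...) calls above: they rely on the pylint.epylint module, which newer pylint releases (3.0 onward) no longer ship. If that import fails for you, a workable alternative is to run pylint as a separate process. The sketch below is one way to do that under the assumption that pylint is installed for the same Python interpreter; the helper names pylint_command, decode_exit_status, and run_pylint are hypothetical. It also decodes pylint's exit status, which is a sum of bit flags rather than a simple count.

```python
import subprocess
import sys

# Bit flags that pylint combines into its process exit status.
EXIT_FLAGS = {
    1: 'fatal',
    2: 'error',
    4: 'warning',
    8: 'refactor',
    16: 'convention',
    32: 'usage error',
}


def pylint_command(filename):
    """Build the command that runs pylint with the current interpreter."""
    return [sys.executable, '-m', 'pylint', filename]


def decode_exit_status(status):
    """Return the message categories encoded in a pylint exit status."""
    return [name for bit, name in EXIT_FLAGS.items() if status & bit]


def run_pylint(filename):
    """Run pylint on filename; return its text report and decoded categories.

    Assumes pylint is installed. If it is not, the interpreter itself exits
    with status 1, which would be misread here as 'fatal'.
    """
    result = subprocess.run(
        pylint_command(filename),
        capture_output=True,
        text=True,
        check=False,  # a non-zero status just means pylint found something
    )
    return result.stdout, decode_exit_status(result.returncode)
```

Calling run_pylint('wiki.py') returns the same kind of report that lint prints, along with a list of the categories of messages that were raised; an empty list means pylint had no complaints.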