pdoc: Auto-generated Python Documentation

This short post looks at pdoc, a lightweight documentation generator for Python programs, provides an example used on the logline docs page, and explores some advantages of automatic docgen for the pragmatic programmer.

Radical simplicity

pdoc is a super simple tool for auto-generating Python program documentation.

Python package pdoc provides types, functions, and a command-line interface for accessing public documentation of Python modules, and for presenting it in a user-friendly, industry-standard open format. It is best suited for small- to medium-sized projects with tidy, hierarchical APIs.

It's a pip package, so a simple command-line installation will have you up and running in seconds. pdoc is a lightweight alternative to Sphinx with minimal cognitive overhead. If you love Markdown you'll probably love pdoc. And if you don't love it, you'll like it.

Less is more

You can output your documentation in PDF or HTML format with a simple flag:

pdoc --pdf filename
pdoc --html filename

The PDF option is a personal favorite because it produces a nice LaTeX document that would make Donald Knuth smile. You can even push it to the arXiv and pretend you're a physicist.

Autogenerated documentation is not exactly the Platonic ideal of literate programming, but if you have well-docstringed modules, functions, and classes, pdoc produces a user-friendly breakdown of your code into clean, modular sub-headings.
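As a quick illustration, here is the kind of docstringed module pdoc feeds on. The module and its contents are hypothetical (a toy akq.py, not the actual repo code):

```python
"""A toy module to show pdoc-friendly docstrings."""


def hand_rank(card: str) -> int:
    """Return the rank of a card in the AKQ game: A > K > Q."""
    ranks = {"A": 3, "K": 2, "Q": 1}
    return ranks[card]


class Player:
    """A player holding a single card."""

    def __init__(self, card: str):
        """Deal the player one card ('A', 'K' or 'Q')."""
        self.card = card
```

Running pdoc --html akq.py would render each of those docstrings under its own sub-heading, module at the top, functions and classes below.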

The AKQ Game

The Ace-King-Queen Game is a toy-model poker game written in Python. You can check out the GitHub repo here. It's a command-line game that distills the essence of hand ranges, bluffing frequencies, and calling frequencies into a cartoonishly simple three-card game for two players. This implementation pits the human player against an "artificially intelligent" computer player.

I used pdoc to auto-generate the HTML documentation for the game. A few tweaks and it was fit to publish on the logline website.

The Overview Effect

Per Wikipedia:

The overview effect is a cognitive shift in awareness reported by some astronauts during spaceflight, often while viewing the Earth from outer space.

On a more mundane plane, viewing your code from a high level can be clarifying: you inspect the atomic elements of a script, neatly arranged and indexed. The beauty of pdoc is how lightweight and fast it makes the docgen process. You can use it as part of an iterative development workflow, taking a periodic bird's-eye view of your code to survey the big picture.

Function-wise inspection of a program makes a lot of sense. If all the function names and docstrings pass the smell test after a quick run through pdoc, you're probably on the right path. Otherwise, it can help weed out faulty logic or potential missteps before they sabotage your script.

The Read! Command in Vim

Note: I discovered this command reading through @jovica's wonderful book Mastering Vim Quickly.

The Basics

You can use the read command to insert a file, or the output of a system command, into the current buffer:

  • :r file.txt will insert the file below the cursor in the current buffer.

  • :r !{command} will insert the output of {command} below the cursor in the current buffer.

See the full details by typing :help read. Here is the example they chose in the official docs:

[Screenshot: Vim help for the :read! command]

The fun

The beauty of this command is that it helps you think about your vim buffer as a nexus or clearinghouse for all sorts of system commands.

Say you have text sitting in various files, or you have hacked together a little program to OCR snapshots of text. You can use the read! command to pull that text directly into your buffer without exiting, context switching, or fooling around with an involved series of clipboard commands.

Document and verify simultaneously

The r! command is very handy when you’re documenting a piece of source code. Imagine you’ve hacked together a little program called sopranos.py which simply pulls a random line from “The Sopranos” and prints it to STDOUT. If you want to write a simple How-To doc to explain how the script works, you can quote from the source code and insert examples of the program output, without exiting your editor buffer.

If you type :r! python3 sopranos.py and the program works, it will insert a random line from the show into your buffer. The simplicity of this command belies its power: it lets you verify, as you're writing, that the script works; otherwise it will deface your buffer with an error message.

To quote particular lines from the program, use head or tail to filter the text. For example, use :r! head -4 sopranos.py to pull in the first four lines of your script, to document the import dependencies.
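For concreteness, a sopranos.py along these lines might look like the following sketch. The script and its contents are hypothetical, and the quotes are placeholders rather than real lines:

```python
#!/usr/bin/env python3
"""Print a random line from The Sopranos to STDOUT."""
import random

# Placeholder lines; a real script would hold actual quotes.
LINES = [
    "Placeholder quote one.",
    "Placeholder quote two.",
    "Placeholder quote three.",
]


def random_line() -> str:
    """Return one line at random, as sopranos.py would print it."""
    return random.choice(LINES)


if __name__ == "__main__":
    print(random_line())
```

With a script like this on disk, :r! python3 sopranos.py drops one of those lines straight into the buffer, and :r! head -4 sopranos.py pulls in the shebang, docstring, and import.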

The command in action

In this video I use the r! command to pull in my slug template for blog posts. This slug at the top of a markdown file contains the key metadata for generating the final HTML file as part of a static site generation workflow using Python and Jinja. A slug is a unique, human-readable identifier for a resource, written in the very-cleverly-named kebab-case.

Shell Power!

These functionalities are a first cousin to the :!{shell command} command, which runs a shell command from Vim and temporarily switches you to a terminal view to show the output. The beauty of the read! command is that you never have to leave your Vim buffer to test computations, perform arithmetic, or leverage the power of Unix pipelines.

OCR: Parsing an HTML file with Python

Parsing an HTML file generated by an OCR app

The script at this GitHub commit is a command-line tool written in Python: we pass it the name of the HTML file generated by Screenotate, the underlying OCR program. The script uses Beautiful Soup to parse the file, a simple regex to remove the HTML tags, and then copies the relevant text to the clipboard.

#!/usr/bin/env python3
import sys
import re
import pyperclip
from bs4 import BeautifulSoup

Prerequisites: Shebangs & Imports

The script is executed from the command-line, so the first line is a shebang.

The script has 4 import statements:

  • sys allows us to take arguments from the command line
  • re provides regular expression matching operations
  • pyperclip provides copy and paste clipboard functions
  • Beautiful Soup allows us to parse HTML files

Program Flow

  1. The user passes the name of the HTML file generated by Screenotate, which is then stored in the user_file variable.

  2. The file is then run through Beautiful Soup, creating a BeautifulSoup object, where the document is represented as a nested data structure. The find_all("pre") method finds the relevant text for the user (Screenotate encloses the OCR'd text in <pre></pre> tags).

  3. We use a simple regex to find the pre tags, so they can be removed. We don't want them in our output text.

  4. Per the exhaustive docstring, the remove_tags() function removes the tags from the file generated by the OCR app, strips line breaks, and copies the relevant text to the clipboard.

    def remove_tags(text):
        """Strip HTML tags and line breaks, then copy the text to the clipboard."""
        # `tags` is the compiled regex from step 3, e.g. tags = re.compile("<.*?>")
        notags = tags.sub("", text)  # Strip tags
        without_line_breaks = notags.replace("\n", " ")  # Strip line breaks
        pyperclip.copy(without_line_breaks)  # Copy text to clipboard
        print("Parsed text copied to clipboard!")
        return without_line_breaks
  5. We call the function on the stringified text parsed by Beautiful Soup
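The whole flow can be sketched with the standard library alone. This is an illustrative reimplementation, not the repo's code: it substitutes a regex for BeautifulSoup's find_all("pre"), uses a made-up sample string in place of a real Screenotate file, and prints instead of touching the clipboard (pyperclip would do that in the real script):

```python
import re

# Simplified sample of the kind of HTML an OCR app might emit.
html = "<html><body><pre>OCR'd text,\nsplit over lines.</pre></body></html>"

# Step 3: a regex matching any HTML tag.
tags = re.compile(r"<[^>]+>")

# Step 2, stdlib-style: pull out the <pre> blocks.
pre_blocks = re.findall(r"<pre>.*?</pre>", html, flags=re.DOTALL)


def remove_tags(text: str) -> str:
    """Step 4: strip the tags and line breaks from an extracted block."""
    notags = tags.sub("", text)
    return notags.replace("\n", " ")


# Step 5: run the function over the extracted text.
parsed = " ".join(remove_tags(block) for block in pre_blocks)
print(parsed)  # → OCR'd text, split over lines.
```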

OCR: Using Tesseract from the command line

Plain text for the win

While this was a nice little exercise in Python, a little digging showed that there were a number of middlemen to be cut out. Screenotate uses the Tesseract open-source OCR engine to parse the text from your screenshot and then generates an HTML file to serve you the text. It only takes a handful of goes around that merry-go-round before it gets tiresome.

Shell script one-liner

The easiest way to cut out the middleman is to just run Tesseract over your screenshots, spitting the text into a plaintext file for easy grepping, copying and editing. The benefits are numerous: you can batch your OCR work, centralize the text in a single file and using plaintext is a cleaner, more lightweight solution.

The solution is a one-liner shell script:

for i in *.png; do tesseract "$i" stdout >> filename.txt; done

It’s just a for loop, the quintessence of computer automation, nothing more and nothing less. Now you can batch your OCR jobs and run everything in the terminal, in accordance with the scriptures. I chose to save the file as a zsh executable and use the alias ocr to call it. I’m now only ever 4 keystrokes away from launching a powerful optical recognition engine from the comfort of my own home.
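If you'd rather drive the same batch job from Python (say, to log progress or filter files), a sketch might look like this. It assumes tesseract is installed and on your PATH, and filename.txt mirrors the output file from the one-liner:

```python
import glob
import subprocess


def ocr_commands(paths):
    """Build one tesseract invocation per image, mirroring the shell loop."""
    return [["tesseract", path, "stdout"] for path in paths]


if __name__ == "__main__":
    # Append each image's text to filename.txt, as the shell loop does.
    with open("filename.txt", "a") as out:
        for cmd in ocr_commands(sorted(glob.glob("*.png"))):
            subprocess.run(cmd, stdout=out, check=True)
```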

Getting started with logline

Step 1: What kind of documentation do you need?

  1. Business docs (meeting minutes, debate summary, financial reports...)

  2. Software docs (API docs, man pages, user guides...)

  3. Scientific docs (team meeting summaries, process documentation, conference summaries...)

Step 2: Do you need your writer to attend a live session?

Ideally, we can work asynchronously to avoid any scheduling complications. This also helps keep project costs down, because live sessions are more expensive.

Now that most meetings and events are taking place using videoconferencing tools, it's even easier to work asynchronously. You just send us the recorded video, where each speaker is identified automatically. This reduces the friction of working on new projects and helps avoid attribution errors, especially for conferences and Q&A sessions.

Step 3: All about deadlines

Deadlines are the lifeblood of a writer's routine. Our usual delivery time is 7 working days for standard documents. Rapid turnarounds are available where possible, at an extra cost. As a small team, we carry a natural project load, which means that super-quick turnarounds are the exception rather than the rule.

Step 4: Fine tuning

We like to keep the lines of dialogue open during the writing process. We are available for any adjustments, addenda or reshaping that you might want to make. We are more than happy to jump on a quick video call or iterate via email for any fine tuning.

Step 5: Delivery and payment

We use a simple Stripe checkout to bill per project. You will receive a notification (via email) that your document is ready, with the link to the secure payment page. You will receive your documents once the payment is complete.

The English language as a service

The bulk of global business documentation and communication is in English. English use is accelerating worldwide because it is the de facto lingua franca of science and technology. We believe that clear, compelling written communication is a bottleneck for non-native English-speaking teams, founders, and scientists.

The English language as substrate

The world is witnessing exponential growth in technology startups, scientific knowledge, and global interdisciplinary cooperative projects. The entire substrate of basic research, and of the programming languages that power the modern world, is built on top of the English language.

This means that first-rate English-language skills are a prerequisite today for teams operating in business, science, and technology. Take the example of ITER, an international nuclear fusion research and engineering megaproject funded and run by seven members: the European Union, India, Japan, China, Russia, South Korea, and the United States. The official language of the ITER Organization is English. At the end of the day, everything gets compiled into English.

More globalization means more English

As the whole world moves online, led by startups, we are seeing a new wave of globalization, further compounding the importance of excellent English communications. This raises the bar for everybody and, in particular, smaller teams. For any team looking to sell a product or service worldwide, imperfect English copy is a severe handicap.

Full-stack English engineers

We help teams of all shapes and sizes outline, draft and complete essential documentation in English. The bulk of our work involves summarising mission-critical meetings, developing a memo system and producing pristine, readable papers for your audience.

We believe in systems thinking and we look at your natural language expression throughout your activity — from your idea generation and internal memo writing, to your official documentation, marketing copy and external papers. We are full-stack English engineers!